Dr. Paul Bone
paul.bone.id.au
I these slides use some JavaScript and WebAssembly in addition to the reveal.js library. I tested this with Firefox 63, I did not test on other browsers or attempt to polyfill things (such as arrow functions) for older browsers. If it doesn't work try a more recent browser.
I have added some exposition slides, to help make reading the slides without the presentation clearer.
Press SPACEBAR to go to the next slide / reveal fragments.
Some images are fairly obviously not my own work. Diagrams, code and prose are Copyright (C) 2018 Paul Bone. You may view this information, but may not duplicate it, pass it off as your own work etc.
Compilers are made of a pipeline of operations, transforming different data structures.
This talk is about code generation.
People assume it's hard.
Or worse, eldritch knowledge kept secret by a sacred order of complier authors.
1992 — Teen Talk Barbie
"Math class is tough"
1994 — Lisa vs Malibu Stacy
Malibu Stacy:
"Thinking too much gives you wrinkles"
Lisa Lionheart:
"Trust in yourself, and you can achieve anything"
The point I'm trying to make with these slides and what I was saying verbally; is that we shouldn't tell ourselves "X is hard" because if we do, then we'll never attempt it. In some cases, this can even lead to gross misunderstandings held about a topic.
Saying something is easy can also be a problem.
What I'm getting at is that it's not as hard as you may think. It's achievable.
I also like to say that it's fun.
These are all deep topics, but you can learn the basics in an afternoon.
The basics are fairly basic, but the depths can get really deep.
Stack |
push 32;
push 35; // c
push 9;
mul;
push 5;
div;
add;
i32
,
i64
,
f32
,
f64
and
function types2's complement arithmetic:
i32
The relational operators (equal, not-equal, less-than etc)
return i32
regardless of their input type.
Conditional branch consumes an i32
// x (64-bit) is on the stack
i64.const 5
i64.lt_s // (i64 i64 - i32)
if ... // (i64 -)
Control flow is structured
This is fairly unusual for an IR, it allows:
break
and continue
but not arbitrary goto
.
call_indirect x
where x is a function type.
i32
from the top-of-stack, let it
be aProbably coming in a future version:
Are being considered / might be considered:
I don't work WebAssembly, including it's specification.
The next slides show a Hypothetical language. This is designed to be as simple as possible in order to demonstrate code generation. This is not supposed to be a good code generator but a basic code generator.
Abstract Syntax Tree
data Func = Func {
name :: String,
args :: [String],
body :: Expr
}
data Expr = BOp Op Expr Expr
| Let String Expr Expr
| Case Expr [PatExpr]
| Call String [Expr]
| Var String
| Lit32 Integer
data Op = Add
| Subtract
| ...
| LessThan
| LessThanOrEqual
| ...
| NotEqual
| Equal
data PatExpr = PatExpr Pattern Expr
data Pattern = Number Integer
| Wildcard
Example program
Let "c" (Lit32 35)
BOp Add
(BOp Div
(BOp Mul (Lit32 9)
(Var "c"))
(Lit32 5))
(Lit32 32)
What it doesn't have
- Only one data type 32-bit signed integer.
- Higher order values / calls.
- Closures
- Any data structure, eg: cons cell
Generating WebAssembly
Visit each node depth first, generate instructions.
i32.const 9
i32.get_local 0
i32.mul
i32.const 5
i32.div
i32.const 32
i32.add
How did we know where this variable was?
Allocate locals
data Symtab a = Symtab {
symbols :: M.Map a Int,
next_id :: Int
}
build_locals :: Ast.Expr -> State (S.Symtab String) Ast.Expr
build_locals (Ast.Let var0 let0 in0) =
do do_rename <- s_member var0
(var, in2) <- if do_rename
then renaming
else do s_new var0
return (var0, in0)
let_ <- build_locals let0
in_ <- build_locals in2
return (Ast.Let var let_ in_)
- I have no duplicate variables,
this could have been more optimal.
- Add each variable to the
symbol table.
Get and set locals
Get
Var x
:
- Lookup the var in the symbol table.
- Generate
get_local id
.
- Which will read the local and push its value onto the
stack.
Set
Let x e1 e2
:
- Generate the code for e1.
- Lookup the var in the symbol table.
- Generate
set_local id
.
- Which will take the result of e1 from the stack
and save it to the local
- Generate the code for e2.
Case
...
get_local 1 // Step 1.
set_local 2 // Step 3.
get_local 2 // Step 4.1
i32.const 1 // cont.
i32.eq // cont.
(if (result i32)
(then
(i32.const 1)) // 4.2.
(else // 4.1.
(get_local 0) // 4.2.
(i32.const 1)
i32.sub
...)
)
WebAssembly makes this very easy with its nested blocks.
Case expr pats
- Generate the code for expr
- Create a new variable.
- Generate:
i32.set_local v
.
- for each pattern:
PatExpr pat
expr
- Generate test code for pat.
- Generate code for expr.
- Generate:
unreachable
Exposition: This slide kind-of has a
bug. I may fix it.
Demo!
Status:
Input:
Output:
Source:
-- Sorry, the syntax doesn't actually support
-- multi-line functions.
public fib n = let test = n < 2 in case test of
1 -> 1
0 -> fib(n - 1) + fib(n - 2)
WebAssembly Text Format (WAT)
(func (export "fib")
(param i32)
(result i32)
(local i32 i32)
(get_local 0)
(i32.const 2)
i32.lt_s
(set_local 1)
(get_local 1)
(set_local 2)
(get_local 2)
(i32.const 1)
i32.eq
(if (result i32)
(then
(i32.const 1))
(else
(get_local 2)
(i32.const 0)
i32.eq
(if (result i32)
(then
(get_local 0)
(i32.const 1)
i32.sub
(call 2)
(get_local 0)
(i32.const 2)
i32.sub
(call 2)
i32.add)
(else
unreachable))))
return)
Thank you
- This code generator:
-
github.com/PaulBone/ast2wasm
- Me:
- paul.bone.id.au /
@Paul_Bone
Code generation is challenging
/ rewarding
/ fun.