Another week has flown by at the Recurse Center, where I managed to tie up a couple of loose threads related to Vorpal, the programming language that I'm building.
Vorpal is now in a state where I can use it to write programs with a syntax close to what I'm eventually envisioning. This requires a bit of bootstrapping, as Vorpal's core language does not provide any operator for deep equality, let alone anything resembling pattern matching. This is by design, however, because one of the goals is to be extensible enough that such language constructs can be added in the language using macros.
That's what I did this week, going from a level of abstraction similar to lambda calculus to pattern matching with unification. Here's the end result:
match: Pair(Foo, Bar) with: [
Pair('x, 'x) -> { Twice(x, x) }
Pair('x, Foo) -> { SecondIsFoo }
Pair('x, 'y) -> { y }
'_ -> { throw!(InvalidPair) }
]
The prelude (the small set of functions that are by default included in every Vorpal program) now defines the two functions “match-with” (which can be called using a Smalltalk-like keyword call syntax as “match: ... with: ...”) and “->” (which expects a pattern and a body and returns a function that takes a value and executes the body if the pattern matches the value).
I settled on an instruction set for Vorpal's stack-based bytecode VM. It's definitely not the most efficient one, but simple to implement, which is more important for my purposes. There are currently 11 instructions (with plans to stick to a maximum of 16 instructions in the long term) with variable lengths ranging from 1 to 4 bytes:
| NAME | BYTE 1 | BYTE 2 | BYTE 3 | BYTE 4 |
+------------+----------+----------+----------+----------+
| LOAD_FN | TAG | --FN_INDEX-------------- | --VARS-- |
| LOAD_STR | TAG | --STRING_INDEX---------- |
| LOAD_EFF | TAG | --STRING_INDEX---------- |
| LOAD_VAR | TAG | --VAR_INDEX-- |
| APPLY_PRE | TAG | -- |
| APPLY_POST | TAG | -- |
| RETURN | TAG | -- |
| CMP | TAG | -- |
| UNPACK | TAG | -- |
| TRY | TAG | -- |
| UNWIND | TAG | -- |
Here's what the instructions do:
I got rid of a special instruction for recursion. There is still a recursion combinator in the intermediate (lambda-like) calculus, but the combinator is then compiled into a simple fixed-point function using the following code, which is guaranteed to be placed at offset 0 in the list of operators:
// fix = f => x => f(fix(f))(x)
let mut bytecode = vec![
// 0: f => ...
Op::LoadFn { code: 2, fvars: 1 },
Op::Return,
// 2: ... x => f(fix(f))(x)
Op::LoadVar(1), // f
Op::LoadFn { code: 0, fvars: 0 }, // f, fix
Op::LoadVar(1), // f, fix, f
Op::ApplyPrefix, // f, fix(f)
Op::ApplyPrefix, // f(fix(f))
Op::LoadVar(0), // f(fix(f)), x
Op::ApplyPrefix, // f(fix(f))(x)
Op::Return,
];
Next week I want to...
Want to become a better programmer? Join the Recurse Center!