Notes
2025/11/28

Partial evaluation in Kombucha

I've recently been thinking about how Kombucha's two-stage execution model works and whether it makes sense to keep both stages separate or try to unify them. This came up because Kombucha now has both a partial evaluator and a bytecode VM, and the relationship between these two systems is worth exploring.

Why partial evaluation matters for macros

The main reason Kombucha needs partial evaluation is macros. When you write a macro in Kombucha, the macro system transforms your code before it gets compiled. This transformation introduces overhead: the macro machinery itself creates intermediate data structures, bindings, and function calls that exist only to support the transformation process.

Without partial evaluation, all of this macro overhead would remain in the compiled code. Every time your program runs, it would be executing not just your intended logic but also the scaffolding that was only needed during macro expansion. Not the best approach.

Partial evaluation solves this by running the macro transformations ahead of time. The partial evaluator executes the macro code once, evaluates away all the intermediate structures, and produces simplified code that contains only what you actually wanted to compute. The result is that macros become zero-cost abstractions: they give you expressive power at compile time without any runtime penalty.

This is particularly important in a language like Kombucha where macros are a first-class feature and are even used for variable assignments and function definitions. If every use of a macro carried runtime overhead, Kombucha's syntactic flexibility would come at a pretty steep price. (In fact, this is why I avoided relying on previous definitions in the current Kombucha prelude, which makes the code really ugly and hard to follow.)

Two evaluators

I'm currently experimenting with two separate systems for executing code:

These two systems are conceptually doing the same thing (evaluating Kombucha expressions), but they do it in different ways. The partial evaluator works directly on the intermediate language representation, while the bytecode VM executes a lower-level instruction set.

Having two evaluators means maintaining two separate implementations of Kombucha's semantics. Any change to the language needs to be reflected in both places.

Could these two evaluation strategies be unified? Could we have a single interpreter that handles both partial evaluation and runtime execution?

There's a certain appeal to this idea. If we had one interpreter, we'd only need to implement language semantics once. Partial evaluation would just be running the interpreter on code where some inputs are known and some aren't. The interpreter would evaluate what it can and leave the rest as expressions to be evaluated later.

However, there are good reasons to keep them separate:

For now, I'm keeping the two systems separate. The partial evaluator handles macro expansion and other compile-time optimizations, producing simplified code. The bytecode compiler then translates this simplified code into an efficient instruction set, and the VM executes it.

The current challenges

Unfortunately, the current partial evaluator doesn't actually work. It has a tendency to overflow the stack during compilation, which is a pretty fundamental problem. The issue is that recursive partial evaluation is really hard to get right. When you're trying to partially evaluate recursive functions, you can easily end up in situations where the evaluation blows up exponentially.

The problem is that partial evaluation isn't just call-by-value lambda calculus evaluation. In CBV lambda calculus, you evaluate arguments before passing them to functions, and that's it. You're always making progress toward a final value. But partial evaluation needs to handle cases where some values are known and others aren't. You might partially evaluate a function call where only the function is known, or where only some arguments are known, or where you need to evaluate the function body up to a certain point and then stop.

This creates all sorts of edge cases. When do you stop evaluating? If you're evaluating a recursive function and some arguments are unknown, do you unroll it once? Twice? Not at all? If you unroll it, do you then try to partially evaluate the recursive call? And what if that recursive call leads to another recursive call with slightly different known values? You can easily end up in a situation where the partial evaluator keeps unrolling and simplifying in an infinite loop, or where it produces exponentially larger code at each step.

The classic approach to this problem is to track which expressions you've already seen and avoid re-evaluating them. But that's tricky when you're dealing with functions that capture variables from their environment, because the same function with different captured values is really a different function. And tracking all of this state adds complexity to an already complex system.

Restricting to comptime functions

One solution I'm considering is to restrict partial evaluation to explicitly annotated comptime functions. Instead of trying to partially evaluate everything during compilation, only evaluate functions that are specifically marked as compile-time only. This would make the partial evaluator's job much more predictable.

This approach sacrifices some of the potential power of partial evaluation. Ideally, the compiler would automatically figure out what can be evaluated at compile time and do it without any annotations. But given the difficulties in making that work reliably, explicit annotations might be a reasonable compromise. It would at least make macro expansion work correctly, which is the main reason for having partial evaluation in the first place.