Notes
2025/12/07

Colons for bindings, macros, types and keywords

As part of sketching out Kombucha's type system, I've been thinking about how to reduce the visual noise in Kombucha's syntax, primarily in two areas: macros and keyword calls.

Bindings and macros

Right now, Kombucha's macro system requires every binding (every “variable-to-be-bound”) to be marked with an explicit prefix colon, whether the variable is bound in the enclosing scope or in an explicit {...} block:

// `p` will be bound
:p = Point(x, y)

match (p) [
    // `x` and `y` will be bound
    Point(:x, :y) -> {
        // ...
    }
]

Peppering the code with : prefixes isn't exactly pretty. In the pattern matching example, it would be possible to infer from context that x and y are bindings, because a {...} block appears on the right-hand side of ->. This is harder to do for something like assignment using =, however, because p is bound in the enclosing scope.

Not all statements in the outer scope bind something, though; some just cause side effects:

// `=` binds the value bar to foo
:foo = bar

// print presumably has a side effect
print(foo)

These two cases need to be desugared differently, which is why Kombucha's current syntax requires the : prefix for :foo, so that it's clear that = acts as a macro and not as a side effect.

Is there a better solution for the enclosing scope that could infer that foo is supposed to be a binding, similar to the pattern matching case?

One option is to assume that all identifiers that appear on the left-hand side of an infix function are to be bound, unless the whole expression is explicitly marked as a side effect, for example using ;. (If identifiers on the left-hand side of an infix function are to be used as values, they can be explicitly resolved using the pin syntax ^foo.)

The above examples would then look as follows:

// `p` will be bound
p = Point(x, y)

match (p) [
    // `x` and `y` will be bound
    Point(x, y) -> {
        // ...
    }
]

// `=` binds the value bar to foo
foo = bar

// print has a side effect
print(foo);

This has a few drawbacks:

The macro system would be restricted to infix functions. Additionally, bindings could only appear as the left-hand argument of an infix function. (But separate syntax could later be added for explicitly calling prefix functions as macros.)
The ; suffix is necessary to mark an expression as having a side effect, even in inline contexts separated with commas, leading to expressions such as { foo = bar, print(foo);, baz = qux }. (This could be solved by breaking the assumption that newlines and commas are always interchangeable.)

An even more restrictive alternative would be to assume that (inside of blocks) infix functions are always macros that bind variables, whereas prefix functions always have side effects.
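
To make the two inference rules concrete, here is a minimal sketch in Python (not Kombucha, and not how Kombucha's macro system is actually implemented). The Ident and Call node types, the pinned and effect flags, and the function names are all hypothetical stand-ins for whatever the parser would produce. The sketch also assumes that a prefix call never binds anything under either rule.

```python
from dataclasses import dataclass, field


@dataclass
class Ident:
    name: str
    pinned: bool = False  # True models the pin syntax `^foo` (hypothetical flag)


@dataclass
class Call:
    func: str
    args: list = field(default_factory=list)
    infix: bool = False   # True for `a = b` or `a -> b`
    effect: bool = False  # True when the statement carries a trailing `;`


def pattern_names(node):
    """Collect unpinned identifier names inside a pattern, left to right."""
    if isinstance(node, Ident):
        return [] if node.pinned else [node.name]
    if isinstance(node, Call):
        return [name for arg in node.args for name in pattern_names(arg)]
    return []


def bindings(stmt):
    """First rule: the left operand of an infix call binds, unless `;` is present."""
    if stmt.infix and not stmt.effect:
        return pattern_names(stmt.args[0])
    return []


def bindings_strict(stmt):
    """Stricter rule: infix calls always bind, prefix calls never do."""
    return pattern_names(stmt.args[0]) if stmt.infix else []


# p = Point(x, y)
assign = Call("=", [Ident("p"), Call("Point", [Ident("x"), Ident("y")])], infix=True)
# Point(x, y) -> { ... }
arm = Call("->", [Call("Point", [Ident("x"), Ident("y")]), Call("block")], infix=True)
# print(foo);
side_effect = Call("print", [Ident("foo")], effect=True)
# ^foo = bar
pinned = Call("=", [Ident("foo", pinned=True), Ident("bar")], infix=True)

print(bindings(assign))       # ['p']
print(bindings(arm))          # ['x', 'y']
print(bindings(side_effect))  # []
print(bindings(pinned))       # []
```

Under the stricter rule the effect flag is never consulted, which is what would make the ; suffix unnecessary.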

Types and keywords

Kombucha currently uses postfix : for Ruby/Elixir-inspired keyword lists: A keyword is any identifier followed by a colon. Prefix function calls can have trailing keyword arguments, which are pairs of keywords and expressions:

// This keyword list...
if (x == y) do: {
    print("true")
} else: {
    print("false")
}

// ...is desugared to:
if(
    x == y,
    [
        ["do", { print("true") }],
        ["else", { print("false") }]
    ]
)

This also makes for a nice data language, because atoms (identifiers starting with an uppercase letter, acting as interned strings) support this syntax as well:

// This...
Link href: "http://example.com" title: "Just an example"

// ...is desugared to:
Link([
    ["href", "http://example.com"],
    ["title", "Just an example"]
])

It is sometimes nice to write these keyword calls on multiple lines, which is why Kombucha interprets an expression foo: bar as ["foo", bar] and also allows [...] to appear as a trailing argument:

// This...
Link [
    href: "http://example.com"
    title: "Just an example"
]

// ...is desugared to:
Link([
    ["href", "http://example.com"],
    ["title", "Just an example"]
])
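
The desugaring shown in the examples above can be sketched in a few lines of Python. This is only an illustration under assumptions: desugar_keyword_call is a hypothetical helper, calls are modelled as (name, args) tuples, and blocks are represented by placeholder strings rather than real syntax nodes.

```python
def desugar_keyword_call(name, positional, keywords):
    """Fold trailing keyword arguments into one list of [key, value] pairs.

    `keywords` is a list of (key, value) tuples so that source order is kept.
    """
    pairs = [[key, value] for key, value in keywords]
    args = list(positional)
    if pairs:
        # The keyword list becomes a single trailing argument.
        args.append(pairs)
    return (name, args)


# Link href: "http://example.com" title: "Just an example"
link = desugar_keyword_call(
    "Link",
    [],
    [("href", "http://example.com"), ("title", "Just an example")],
)

# if (x == y) do: {...} else: {...}   (blocks shown as placeholder strings)
conditional = desugar_keyword_call(
    "if",
    ["x == y"],
    [("do", "<block>"), ("else", "<block>")],
)

print(link)
print(conditional)
```

Both the single-line form and the bracketed multi-line form would end up as the same (name, args) value in this model, since they only differ in surface syntax.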

But it would be nice to use a postfix : for type annotations, as this has become the standard syntax in most modern languages. While this might be compatible with keyword function calls (where keywords only appear after the initial function identifier, never as the first part of an expression), it would conflict with the pair sugar above, because x: y could stand either for ["x", y] or for the value x annotated with the type y.

What's the best way to solve this? There are a few options:

Continue using : for keyword arguments, use a different syntax for type annotations, such as ::. This would work, but feels quite unfamiliar coming from other languages. Additionally, type annotations will likely be used more often than keyword arguments and should have a succinct syntax.
Use : for types, use a different syntax for keyword arguments. It is unclear what kind of syntax would make sense for keyword calls though.
Use : for both keyword arguments and types, drop support for x: y as sugar for ["x", y]. This would work, but would make the data language considerably less elegant and force all keyword arguments to appear on the same line.
Use : for both keyword arguments and types, disambiguate based on the context. Types can only appear after bindings and bindings can only appear in limited contexts where the macro system supports them. However, there are contexts such as patterns where both type annotations and the sugar for keyword pairs would make sense.
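
The problem with the last option can be illustrated with a small Python sketch. The context names ("binding", "list", "pattern"), the interpret_colon function, and the Int type name are hypothetical and only serve to show where context-based disambiguation stops working.

```python
def interpret_colon(left, right, context):
    """Decide what `left: right` means, given the syntactic context it appears in."""
    if context == "binding":
        # After a binding, the colon can only be a type annotation.
        return ("annotation", left, right)
    if context == "list":
        # Inside [...], the colon can only be sugar for a key/value pair.
        return ("pair", [left, right])
    if context == "pattern":
        # Both readings are plausible here, so the context does not decide.
        raise ValueError(f"ambiguous `{left}: {right}` in a pattern")
    raise ValueError(f"unknown context: {context}")


print(interpret_colon("x", "Int", "binding"))  # ('annotation', 'x', 'Int')
print(interpret_colon("x", "y", "list"))       # ('pair', ['x', 'y'])
```

Calling interpret_colon("x", "y", "pattern") raises, which mirrors the case where neither reading can be ruled out.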

Preliminary decisions

Here's what I'm leaning towards, for bindings, macros, types and keywords:

Drop prefix : for bindings completely, disambiguate based on context, only allow macro arguments on the left hand side of infix expressions. This would restrict macros quite a bit, but perhaps that's fine?
Don't use ; for sequencing, instead consider all prefix function calls (possibly with keyword arguments) to have side effects, consider all infix calls to be macros that bind variables.
Use : for trailing keyword arguments after prefix function calls, just as Kombucha does right now. This makes it possible to build most language constructs.
Use : for types, because it is the most familiar syntax. Type annotations are only allowed after bindings, which are determined by the macro system.
Disallow : for pairs, so that x: y never desugars to ["x", y] and can only be used for type annotations.

I'm not happy about the last decision, but making x: y context-dependent would be confusing and would also require the parser to be aware of the macro desugaring, which I've been trying to avoid until this point.

Those are the preliminary decisions. Let's see whether that combination works in practice.