Structure Editors for Wikilon

I have a new design that vastly improves upon my older extensible syntax and embedded literal objects approaches, and brings Wikilon many steps closer to my long term HCI goals for Awelon project. And it’s surprisingly simple!

Every word in the Wikilon dictionary is now defined by a pair: a compiler function, and a structured value, such that compiling the value returns a block – a first-class function – which is accepted as the meaning of the word. Both elements of this pair are encoded in Awelon Bytecode (ABC), albeit with access to the dictionary via unforgeable `{%foo}` tokens.

Structured values in ABC are constructed of only a few basic primitives:

  • numbers – rational, precise (unless weakened)
  • products (a*b) – pairs of values
  • sums (a+b) – ~ Haskell Either type
  • unit (1) and void (0) – structured identities
  • blocks [a→b] – first class functions
  • discretionary sealers {:fooType} – ~ Haskell newtype

These are the basic types. Floating point numbers might be requested via annotations, e.g. {&float}. Strong, cryptographically sealed values may be accessible via side-effects. But we don’t really need to consider anything beyond the basics for our dictionary and compilers.

The simplest possible compiler function is the identity function, which works when the structured value for the definition is just a block of code.

:swap [rwrwzwlwl]
:dup [r^zlwl]
:swapd [rw {%swap} wl]
:rot [{%swapd} {%swap}]
:dupd [rw {%dup} wl]
:over [{%dupd} {%swap}]

Conveniently, this trivial ‘identity function’ compiler corresponds closely to the (now deprecated) Awelon Object (AO) code. Compare AO code as it exists today:

@swap %rwrwzwlwl
@dup %r^zlwl
@swapd %rw swap %wl
@rot swapd swap
@dupd %rw dup %wl
@over dupd swap

To directly read and write this raw ABC in place of AO code is doable. Indeed, I’ll probably do some of that when getting started in this new model. However, the growing noise from `{%foo}` wrappers and inconvenient number encodings (e.g. `#1234#1000/*l` is the ABC equivalent to `1.234` in AO) would grow wearisome. Fortunately, in context of a structure editor, I think we could present blocks as an AO variant to the user, even rendering `1.234` as appropriate, and perhaps fading out raw data plumbing code. And hyperlinking words, of course. 🙂

But we aren’t limited to the trivial compiler. We can use arbitrary, ad-hoc structures to represent our word definitions… and we can render the definitions appropriately, i.e. using lists and dictionaries and so on.

The `{:fooType}` discretionary value sealers provide a convenient mechanism to denote that a sealed substructure might benefit from a special rendering and editing widget. This can be very ad-hoc, driven by convention and reflection. E.g. if we want markdown text to use special syntax highlighting and editing tools, we could model this as a text literal followed by `{:md}`, and allow our editors to know what is expected in this case or reflect back into the dictionary searching for words named something like `edit:md`. This should achieve most of the advantages of embedded literal objects, while avoiding most of the weaknesses.

This new approach has many other important advantages:

First, the use of `{%foo}` tokens to access words makes trivial a global renaming of words. Simple renaming is actually what led me to this design. My earlier approach to extensible syntax left word-recognition to a user-defined parser function, which made it too difficult to recognize, refactor, or replace words in a more generic sense.

Second, definitions modeled this way are very scalable – i.e. because it’s easy to use generic mechanisms for compression, linking, automatic refactoring, structure sharing.

Third, it’s easy to save arbitrary data structures into the dictionary to simplify the flow of application-generated data (or even whole application-snapshots) back into reusable code.

Fourth, it’s easy to reuse compiler functions, or even embed them (e.g. constructing a pair of a foo value and the foo compiler), when modeling higher level languages. Thus, languages are readily compositional.

Fifth, with respect to my long term HCI goals, it’s easy to model user input as streaming code, functions that operate on the structured value. There is no need to repeatedly parse and pretty-print. We can abstract a stream into new user macros and editing tools, or as ad-hoc reusable functions.

This technique is, in a broad sense, similar to storing an AST in the database. But it also differs in ways that I consider interesting: opacity of block structures to the compiler function, easy support for user-macros to construct and manipulate the structured values, absence of predefined structure, and easy structured reuse of languages. I believe this new approach to extensible syntax and structured editing will prove a good long-term fit for Awelon project.

Addendum 20150302: We can use metadata to associate a compiler function with each word; however, metadata is relatively inaccessible. An interesting alternative is to push the compiler function into the structure generated by each word, e.g. such that definitions have type `type Def a b = ∃v.∀e. e → ([v→[a→b]] * (v * e))` and an example might appear as `:fooExample mkFooStruct [{%fooLang}]`. This technique is simple, universal, and offers developers much ability to directly manipulate and refactor language (how a structure is interpreted) in addition to the structure. The simplest compiler function is still identity, e.g. `:dup [r^zlwl] []`.

This entry was posted in Language Design. Bookmark the permalink.

5 Responses to Structure Editors for Wikilon

  1. Kyle Blake says:

    Interesting. Does something like {&float} not violate the no-semantic-effect rule for annotations, since it introduces rounding after every operation? Would an ABC compiler have to understand {%foo} references (and thus know how to look up things from the dictionary), or would the equivalent of “ao abc.ann” perform the inlining automatically?

    • dmbarbour says:

      Wikilon would inline the definition of {%foo} before passing it on to the ABC compiler.

      When a human says {&float}, I think they’re saying “this `1.03 volts` number wasn’t exact anyway, so please don’t make me pay for exact precision arithmetic.” Whether it violates semantics perhaps depends on whether you’re referring to human’s meaning or the machine’s observable behavior.

      Ideally, I’d prefer to explicitly model floating precision numbers, then (via annotations) tell the compiler to recognize and replace the super-slow explicit models with super-fast hardware implementations. This would offer a good formalism for floating point computation. But that won’t happen for a long time. For this kind of accelerator, collections-oriented programming (vectors, matrices) has higher priority.

      Meanwhile, I’m willing to play a little fast and loose with {&float} if it means I’m significantly more performance competitive in the short term.

      • Kyle Blake says:

        OK, cool. Yeah, that’s fair enough. The idea of explicitly modelling fp numbers is quite interesting.

  2. Pingback: Abstract Awelon Virtual Machines | Awelon Blue

  3. Pingback: Command Language for Awelon | Awelon Blue

Leave a comment