Compiling ABC/AO

I’ve spent most of this month developing a more sophisticated compiler for Awelon project, based on first translating the code into a dataflow graph, then translating to Haskell. This allows easy use of Haskell’s variables, and more effective use of pure subprograms, which, presumably, GHC knows how to optimize. At the moment, runtime performance only slightly exceeds interpreted performance for a simplistic microbenchmark. But I have yet to implement a few simple optimizations on the intermediate graph. Progress on the compiler has been painfully slow, involving a lot of backtracking and exploration of designs (though it’s only a couple thousand lines of code). Anyhow, this post isn’t primarily about performance or progress.

I’ve been thinking about how I should expose compilation to programmers. This post is just me thinking out loud. You’re welcome to join in.

Currently, I support dynamic compilation by use of a `{&compile}` annotation. Annotations must have identity type and no effect observable within the program, but are able to support debugging or tweak performance – e.g. parallel evaluation, or compilation of a block, or runtime assertions. In this case, obviously, we’re compiling a block. But use of an annotation is very explicit, and can be somewhat awkward – e.g. requiring constructs like `[foo] compile inline` instead of just `foo`.
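To make the identity-plus-side-channel behavior concrete, here is a minimal sketch of how an interpreter might dispatch annotations. The function name and the job-queue mechanism are illustrative assumptions, not Awelon's actual implementation; the key property is that every annotation returns its value unchanged.

```python
def apply_annotation(name, value, compile_jobs):
    """Annotations never change `value` (identity type, no observable
    effect). A recognized annotation like {&compile} may act through a
    side channel -- here, a hypothetical queue of compilation jobs."""
    if name == "&compile":
        compile_jobs.append(value)  # schedule this block for compilation
    return value  # semantically, every annotation is the identity
```

Unrecognized annotations simply fall through, which is what lets annotations tweak performance or debugging without changing a program's formal meaning.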

There is a good case for explicit dynamic compilation, especially in context of DSLs and metaprogramming. Leaving such cases to implicit just-in-time (JIT) compilation has always seemed awkward to me; often the programmer does know best. But there is also a good case for implicit compilation, based on traces or hotspots or similar.

At the moment, in context of Awelon Bytecode (ABC), I’m not sure how to go about implicit JIT. In conventional languages, source tends to have a stable name or location in memory, which is easily used to keep a counter to identify hotspots or traces. ABC is designed for streaming code, and ABC blocks don’t really have any name or identity beyond their structure, and loops often involve compositions of code. Awelon Object (AO) code does support names (it’s essentially a macro language for ABC) but still doesn’t have recursive loops. (I’m also hesitant to compile AO directly, because development in that direction might hinder development for streaming bytecode.)
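One workaround for the missing names, sketched below under assumed names (`tick`, `THRESHOLD` are mine, not Awelon's), is to key the hotspot counter on the block's bytecode itself, since a block's structure is the only identity it has:

```python
from collections import Counter

THRESHOLD = 100  # invocation count at which a block is deemed "hot"

def tick(counts, code):
    """Count invocations keyed on the block's bytecode (its structure),
    since ABC blocks have no stable name or memory location.
    Returns True exactly when the block crosses the threshold."""
    counts[code] += 1
    return counts[code] == THRESHOLD
```

This works, but composition is still a problem: a loop assembled from slightly different compositions of code would be counted as several distinct "blocks".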

One possibility is to leverage a streaming compression and dictionary building algorithm to gradually identify long sequences of ABC code that are seen often, then compile those common sequences to support implicit JIT. I feel this is worth exploring, but it also seems like a lot of work for potentially marginal gains.

Anyhow, I’m also interested in static compilation, and easily mixing static and dynamic components. Dynamic compilation is useful in this endeavor, but I feel there is more I could do. This morning, I had another idea (which should have been obvious in retrospect):

I can use naming conventions in the AO dictionary to drive static compilation.

AO already uses naming conventions to support automatic testing. (Words prefixed with `test.` become automatic tests.) Gradual development of de-facto naming conventions – e.g. to support typechecking, documentation, property testing, optimizations or rewrite rules, exporting applications – has been part of AO’s design from very near the beginning. The idea is that naming conventions do not change the formal meaning of a word (the macro expansion into ABC) but they do suggest contextual application from an IDE or larger system.

The idea is that I could use a prefix such as `#` or a suffix such as `.exe` to suggest that words of the form `#foo` or `foo.exe` should be statically precompiled (in a runtime where this is enabled) for use within the runtime. Whether this is a good idea, or ultimately just as awkward as explicit dynamic compilation, remains to be seen. But it does appeal to me. It allows appropriately coarse-grained compilation for reusable components, and doesn’t hinder performance when interpreting code. It can also enable effective separate compilation, reuse, and code sharing (which might result in better cache behavior).

A potential advantage of this `#foo` approach is that it mitigates a weakness of AO: exponential growth of generated code.

The formal semantics of AO is simply the inline expansion of each word’s definition until there are no more words. All that remains, at that point, is an ABC stream. Keeping these extremely simple semantics is why AO forbids recursive function definitions – there would be no limit to simple inline expansion. Loop behavior is instead expressed non-recursively by use of a fixpoint combinator (which admittedly isn’t very convenient, but developers can build while/until/foreach/etc. loops above fixpoint).
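The fixpoint style of looping may be unfamiliar, so here is the same idea in Python using the Z combinator (the strict-language form of the Y combinator). Note that the loop body never refers to itself by name; it receives itself as an argument, which is exactly what makes simple inline expansion terminate:

```python
# Z combinator: builds a recursive function without named recursion.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# The "loop body" takes itself (`self`) as an argument -- no self-reference
# by name, so a macro-expansion semantics never has to expand a cycle.
factorial = Z(lambda self: lambda n: 1 if n <= 1 else n * self(n - 1))
```

In AO, while/until/foreach loops would be built the same way, as abstractions layered above the fixpoint combinator.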

Anyhow, this inline expansion means that generated bytecode tends to grow exponentially in size with depth and layers of abstraction. This can be partially mitigated for streaming code because we can expand lazily. But not all Awelon project subprograms are streamed, and in general exponential expansion can stress system memory and CPU cache. By utilizing precompiled words at expansion time (as opposed to dynamic compilation, which applies after expansion) it seems reasonable that stress on memory and cache could be reduced. Further, I could integrate this with ABC’s own approach to separate compilation and linking.
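A toy model makes the growth visible. The dictionary below is a made-up example: each word uses the previous word twice, so the expanded bytecode doubles with each layer of abstraction:

```python
def expand(dictionary, word):
    """Toy model of AO semantics: inline-expand each word's definition
    until only primitive bytecode remains."""
    definition = dictionary.get(word)
    if definition is None:
        return word  # primitive bytecode: emit as-is
    return "".join(expand(dictionary, w) for w in definition)

demo = {
    "d1": ["a", "a"],    # expands to 2 primitives
    "d2": ["d1", "d1"],  # 4
    "d3": ["d2", "d2"],  # 8: size doubles with every layer
}
```

A precompiled word would cut this off at expansion time, emitting one fixed-size reference instead of its full expansion.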

Awelon project has a non-conventional and interesting approach to separate compilation, linking, and resource sharing:

Instead of importing human-named modules and functions, we can identify a software component by the secure hash of its bytecode. For example `{#ngv6gnrqki5vgnyxtodjorlrwoiinmfk74gg3ftwj7byrqratmlomruoce57}` might identify bytecode that implements a fibonacci function. Theoretically, there is an opportunity here for collision in names. But, in practice, that probability can be reduced well below the threshold of unlikely problems like cosmic rays and meteor strikes. The convention is that this invocation shall search for a resource, download and validate it if necessary, then cache and compile locally. But this convention can be tweaked for circumstance, e.g. shifting the compilation task to a trusted proxy. If the resource cannot be obtained, the program simply fails. If space is needed, old or infrequently used resources (or perhaps just their compiled objects) can be discarded from cache. Search locations for download might be extended by use of annotations.
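The essence of this content-addressed scheme is easy to sketch. SHA-256 below is a stand-in (the post doesn't specify Awelon's digest), and `store`/`resource_name` are my names, but the invariant is the real one: the resource's name *is* the hash of its bytecode, so identical code always resolves to the same name and a cached copy can be validated against the name itself:

```python
import hashlib

def resource_name(bytecode):
    """Content-addressed naming: the name is the secure hash of the
    bytecode (SHA-256 here as an illustrative stand-in)."""
    return "{#" + hashlib.sha256(bytecode.encode()).hexdigest() + "}"

def store(cache, bytecode):
    """Insert bytecode into a local cache under its hash-derived name.
    Storing the same bytecode twice is a no-op: same code, same name."""
    name = resource_name(bytecode)
    cache[name] = bytecode
    return name
```

Because the name commits to the content, the cache never needs invalidation and the bytes can safely be fetched from any untrusted source and checked on arrival.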

Use of a secure hash instead of human names avoids a lot of problems: update and versioning concerns are cleanly separated, there is no cache invalidation, we can download the code from anywhere, there is no trouble using proxy repositories. The design is highly compatible with open and distributed systems and streaming code. The hash can even serve as a secure capability of sorts, depending on the protocol by which we download the resource. I believe Awelon’s design is theoretically superior to most conventional approaches, but it has yet to be implemented.

In context of precompiled words, it seems feasible to start using this technique directly: code of the form `#foo` might compile to a `{#secureHash}` resource instead of expanding fully into the underlying ABC. (For debugging, this might become `{#secureHash (#foo)}`.) This would make concrete what’s happening with respect to space savings, separate compilation, and reuse, since each compiled element has a fixed size in the ABC stream. Providing developers the control here could prove immediately useful. Long term, however, division of a program into optimal reusable subcomponents might be mostly automated.

So… coming back to JIT.

The earlier difficulty with JIT for ABC was the lack of a stable location or names. However, these precompiled words should serve as moderately effective JIT targets. Naively, we might simply compile each word the first time it is required. But more sophisticated approaches involving counters or whatnot are certainly feasible.

I think I’ll give these precompiled words a try. Reviewing this article, it seems I favor the `#` prefix for precompiled words, perhaps due to the allusion to `{#secureHash}` resources. Hopefully, this really will help with performance by reducing the memory and cache burdens.

Addendum 30 June

I’ve played around with using word prefixes to guide compilation, but I find this fairly inconvenient and rigid, requiring too much editing of the dictionary to test slightly different configurations. An alternative I’m now considering is to use a separate configuration value, which simply contains a list of words to precompile (one per line).
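Parsing that configuration value is a one-liner; the sketch below assumes the simplest plausible format (blank lines and surrounding whitespace ignored):

```python
def precompile_list(config_text):
    """Parse the proposed configuration value: one word to precompile
    per line, skipping blank lines and trimming whitespace."""
    return [line.strip() for line in config_text.splitlines() if line.strip()]
```

The win over naming conventions is that trying a different configuration means editing one value, not renaming words throughout the dictionary.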

This entry was posted in Language Design, Modularity, Open Systems Programming.
