Rewriting is not an unusual basis for programming language formalization. Lambda calculus is frequently expressed in terms of rewriting, as are combinator logics. However, rewriting seems to be unusual at the implementation layer. Instead, we model environments and stacks, none of which are represented in the original program. We abort computation upon encountering free variables or stack underflow, neither of which is an error when the same program is placed in a different context. We worry over control flow, eager vs. lazy evaluation, and parallel strategies, all the while struggling with constant propagation and partial evaluation.
Suppose we implement a language primarily by rewriting. We can feed our program to our evaluator, together with a space and effort quota. It chugs and churns and rewrites for a while, then outputs a different representation of the same program – something we can view and edit just like the input program. (To clarify, this is not the same as generalized term rewriting. The evaluator needs only one specific set of rewrite rules, those for our language semantics.) What have we gained?
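As a minimal sketch of this evaluator shape, consider a toy expression language encoded as nested Python tuples. The encoding, the single `+` rule, and the effort quota are all illustrative assumptions, not a proposal for real syntax; the point is only that the output is always a program in the same representation as the input:

```python
# A minimal sketch of quota-limited rewriting over a toy expression
# language: integer literals and ("+", a, b) nodes.

def rewrite_step(expr):
    """Apply at most one rewrite rule; return (new_expr, did_rewrite)."""
    if isinstance(expr, tuple) and expr[0] == "+":
        _, a, b = expr
        if isinstance(a, int) and isinstance(b, int):
            return a + b, True              # reduce a redex
        a2, ra = rewrite_step(a)            # otherwise rewrite inside
        if ra:
            return ("+", a2, b), True
        b2, rb = rewrite_step(b)
        if rb:
            return ("+", a, b2), True
    return expr, False

def evaluate(expr, effort=10):
    """Rewrite until no rule applies or the effort quota runs out.
    Either way, the result is a program we can render, edit, or resume."""
    while effort > 0:
        expr, progressed = rewrite_step(expr)
        if not progressed:
            break
        effort -= 1
    return expr
```

With a generous quota, `evaluate(("+", ("+", 1, 2), 4))` reduces to `7`; with `effort=1` it stalls at the partially evaluated program `("+", 3, 4)`, which can be fed back in later.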
- Simplified resource management. Even upon hitting quota limits, we receive a useful, partially evaluated result. This is similar to a memory dump, but is wholly independent of the evaluator. We can continue evaluation later, perhaps using a larger space quota or slicing a large computation into small, independent pieces.
- Simplified debugging on error. If our language doesn’t know what to do with (“cat”+3), then we’ll simply carry that expression into the output, giving developers a concrete view of the error. Because there is no context in which that is okay, our evaluator may even reify it as `error(“cat”+3)` so errors are easily counted, highlighted in red, discovered by text search, etc… Our program output on error isn’t a horrifying, illegible memory dump, but rather is the same language as our input and may be rendered and studied using the same tools and text editors.
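Continuing the toy tuple encoding from above (an illustrative assumption, not real syntax), error reification might look like the following: a stuck redex is wrapped in an `error` term, while everything around it keeps reducing.

```python
# A sketch of reifying stuck expressions as error terms instead of
# aborting, using the toy tuple language: literals and ("+", a, b).

def is_error(e):
    return isinstance(e, tuple) and e[0] == "error"

def rewrite(expr):
    """Fully rewrite expr; a type mismatch becomes a visible error term."""
    if isinstance(expr, tuple) and expr[0] == "+":
        a, b = rewrite(expr[1]), rewrite(expr[2])
        if isinstance(a, int) and isinstance(b, int):
            return a + b
        if is_error(a) or is_error(b):
            return ("+", a, b)            # stall above an existing error
        return ("error", ("+", a, b))     # reify the stuck redex
    return expr

result = rewrite(("+", ("+", 1, 2), ("+", "cat", 3)))
# → ("+", 3, ("error", ("+", "cat", 3)))
```

Note that the successful reductions survive alongside the reified error, so the output remains a legible program rather than a crash report.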
- Improved debugging HCI. All program ‘state’ is explicitly modeled as part of “the program”. Thus, there is never any need to hover over or query a variable named `x` to read its value. Instead, you’ll plainly have an `x = 1234` somewhere in the program. Similarly, if a program is evaluating multiple instances of a subprogram (e.g. due to recursive loops or parallelism), you’ll be able to see those instances, which effectively gives you a rendering of the full stack and every parallel computation. Of course, we’ll need to use progressive disclosure techniques for rendering of very large structures. But that can be achieved by simple heuristics or data type annotations, is entirely data oriented, and is a much easier problem to solve.
- Simplified breakpoint based debugging. We can reify breakpoints within our language as `@foo(P)`, configured by name as an additional evaluator input. If inactive, @foo(P) returns P. If active, our evaluator rewrites to `@foo:123(P)` (with `123` unique in the output) and treats the breakpoint as a barrier to use of P, but may continue evaluating independent subprograms. Our output – the partially evaluated program – can be rendered as plain text (or using our normal program editor), and active breakpoints are easily highlighted, found by search, and deleted by hand or by pattern.
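In the toy tuple encoding, a breakpoint term and the active-name set might be sketched as follows (the `("@", name, body)` shape is an illustrative stand-in for `@foo(P)`):

```python
# A sketch of reified breakpoints: ("@", name, body) terms plus a set
# of active breakpoint names supplied as an evaluator input.

import itertools

_uid = itertools.count(1)   # unique ids for triggered breakpoints

def rewrite(expr, active):
    if isinstance(expr, tuple):
        if expr[0] == "@":
            _, name, body = expr
            if name not in active:
                return rewrite(body, active)     # inactive: transparent
            if ":" not in name:                  # active: freeze with unique id
                name = f"{name}:{next(_uid)}"
            return ("@", name, body)             # barrier: body left untouched
        if expr[0] == "+":
            a = rewrite(expr[1], active)
            b = rewrite(expr[2], active)
            if isinstance(a, int) and isinstance(b, int):
                return a + b
            return ("+", a, b)                   # stall here, progress elsewhere
    return expr
```

With `@foo` active, the left operand freezes as an `@foo:N` term while the independent right operand still reduces; with no active names, the breakpoint is invisible and evaluation completes.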
- Simplified log based debugging and profiling. We can trivially configure some breakpoints instead as ‘log points’. For example, @console(P) might both return P and print P to the console. A simple convention would be for an extensible set of names to default to logging (e.g. @log:foo). Of course, logging doesn’t provide much context. We can do better with history based debugging.
- Simplified history based debugging and program ‘animation’. We can animate on arbitrary breakpoints like `@foo` by simply evaluating as far as we can go, taking a snapshot, dropping all the `@foo:*` breakpoints, then repeating. We can animate on multiple breakpoints by choosing a strategy: parallel (@foo and @bar release together), hierarchical (as many @foo frames as we can manage, then one @bar frame, repeat), interleaved (@foo @bar @foo @bar), etc. Because we’re halting on developer selected breakpoints (rather than time quotas) the intermediate programs will be predictably structured, easy to analyze and debug. Ultimately, we have many megabytes or gigabytes of highly compressible program-snapshots that could easily be rendered into a GIF image.
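The evaluate–snapshot–release loop can be sketched directly, again over the toy tuple encoding. The `evaluate` parameter stands in for whatever breakpoint-aware evaluator the language provides; here it is an assumption passed in as a function:

```python
# A sketch of breakpoint-driven animation: run to a stall, snapshot,
# release the triggered @name:* breakpoints, repeat.

def release(expr, name):
    """Drop triggered ("@", "name:N", body) terms, exposing their bodies."""
    if isinstance(expr, tuple):
        if expr[0] == "@" and expr[1].startswith(name + ":"):
            return release(expr[2], name)
        return tuple(release(e, name) for e in expr)
    return expr

def animate(expr, name, evaluate, max_frames=100):
    """Collect one program snapshot per released breakpoint generation."""
    frames = []
    for _ in range(max_frames):
        expr = evaluate(expr)           # evaluate until stalled on breakpoints
        frames.append(expr)
        released = release(expr, name)
        if released == expr:            # no triggered breakpoints left: done
            break
        expr = released
    return frames
```

Each element of `frames` is itself a complete program, so the whole history can be rendered, diffed, or searched with ordinary tools.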
- Simplified partial evaluation and staging. With rewriting, partial evaluation is the normal case, i.e. we can take any subprogram and evaluate it as far as it goes, independently of input. With a few annotations like `asap(P)`, we can tell an evaluator to prioritize evaluation deep within a program, giving us effective control of staged computing, no need to juggle anything.
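A small sketch of why partial evaluation is the normal case: in the toy tuple encoding, an unknown variable (a plain string here, purely for illustration) simply stalls, and everything around it still reduces.

```python
# A sketch of partial evaluation falling out of rewriting: unbound
# variables stay symbolic while surrounding redexes reduce.

def peval(expr, env=None):
    env = env or {}
    if isinstance(expr, str):
        return env.get(expr, expr)      # unknown variable: leave symbolic
    if isinstance(expr, tuple) and expr[0] == "+":
        a, b = peval(expr[1], env), peval(expr[2], env)
        if isinstance(a, int) and isinstance(b, int):
            return a + b
        return ("+", a, b)              # stall around the unknown
    return expr

# ("+", "x", ("+", 2, 3)) partially evaluates to ("+", "x", 5);
# supplying x later finishes the job.
```

The partially evaluated result is an ordinary program, so "staging" is just running the same evaluator twice with different environments.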
- Supports expressive spreadsheet HCI. In a typical spreadsheet, cells are inexpressive programs that reduce to an anemic data model. With program to program rewriting, cells can express anything our language can express – e.g. functions, objects, databases. Even when a cell contains a function, it’s the partially evaluated function, which allows for convenient inspection. Given expressive types, we can support flexible rendering options. Using the ‘animation’ idea, spreadsheet cells could additionally render their own evaluations in a loop for anyone who is interested.
- Simplified integration of DSLs. In context of rewriting, it is easy to leverage partial applications into partial evaluations. Thus, DSLs need be no worse than the host language for performance, not even for external DSLs. This does leave some interesting challenges, such as tracing errors back to lines in the original DSL code. (I do have some fledgling ideas about annotations for tracing, e.g. configuring a @gate to add hidden, contagious markers to data, perhaps with a limited generation count. Such a technique would be useful for active debugging in general.)
- Simplified linking. Programs are their own representations, so we only need to deal with the problem of linking values. Linking values is a much easier problem than linking mutable things, e.g. we can leverage secure hash URLs. With program rewriting, a runtime link failure isn’t a huge problem. The only consequence is that evaluation stalls, much like failing on an effort quota. The output remains a valid program and may easily be continued later, e.g. when the network is up.
- Simplified big data handling. Because runtime linking is lightweight, secure, and fail-safe, we can feel free to tuck things behind links at runtime. For example, if we don’t expect to need P any time soon, we could say `stow(P)` and we get back a link reference to P that may later be loaded. This technique can be directly applied in the construction of data structures – log structured merge trees, hitchhiker trees, finger-tree deques – that may be vastly larger than working memory.
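A sketch of `stow` and secure-hash linking, assuming a content-addressed store keyed by SHA-256 of a canonical plain-text serialization (the in-memory dict stands in for disk or network storage):

```python
# A sketch of stowage: replace a subprogram with a link to its secure
# hash; a missing link merely stalls evaluation rather than crashing.

import hashlib
import json

store = {}   # stands in for a content-addressed disk or network store

def stow(program):
    """Replace a subprogram with a ("link", hash) reference."""
    text = json.dumps(program)                    # canonical plain text
    h = hashlib.sha256(text.encode()).hexdigest()
    store[h] = text
    return ("link", h)

def load(link):
    """Resolve a link; an unavailable hash just stalls (returns the link)."""
    _, h = link
    if h not in store:
        return link                               # stall, don't crash
    return tuple(json.loads(store[h]))
```

Because content is addressed by hash, links are immutable and cacheable, and very large structures (merge trees, finger-tree deques) can keep most of their bulk stowed.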
- Simplified compilation model. When linking subprograms by secure hash, we are not restricted to loading the plain text representation. We may instead load a cached representation that is heavily optimized for the runtime, where the binaries are compact and the functions run fast. We may use annotations like `jit(P)` or `optimize(P)` to encourage the runtime to perform compilation. Interestingly, linking subprograms by secure hash gives us shared objects by default.
- Simplified IO. Assume we model a `type Process = Request → (Response, Process)` and hook it up to a network socket. Request, Response, and Process are programs, because all values in program rewriting are represented by programs. They come with plain-text representation, maybe even a type checker. This eliminates a lot of ad-hoc make work surrounding serialization and input validation. Depending on their types, Request and Response may be very expressive, e.g. containing processes themselves, linked references to other data, and procedurally generated structure. Or they might be plain old data types. Injection and extraction of data by other languages needn’t be any more difficult than working with JSON or SQL.
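The `Process = Request → (Response, Process)` shape is easy to sketch with closures; the counter process below is a hypothetical example, not part of any proposed standard library:

```python
# A sketch of Process = Request -> (Response, Process): each step
# returns a response together with the next process state.

def counter(total=0):
    """A process that sums the numbers it is sent."""
    def step(request):
        new_total = total + request
        return new_total, counter(new_total)   # (Response, next Process)
    return step

p = counter()
resp1, p = p(5)     # resp1 == 5
resp2, p = p(3)     # resp2 == 8
```

In the rewriting setting, `p` at any point is itself a program with a plain-text representation, which is what makes the serialization and checkpointing stories below fall out for free.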
- Simplified transparent persistence. In the aforementioned process model, the Process is easily checkpointed to plain text, and completed Requests are easily recorded to a log. So we can easily use a simple checkpoint persistence model, heuristically updating the checkpoint and clearing the log. This, again, follows from everything being an independently serializable program.
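A sketch of checkpoint-plus-log persistence, assuming process state is a plain serializable value and `step` is a pure `(state, request) → (response, state)` function (both illustrative choices):

```python
# A sketch of checkpoint persistence: serialize state to plain text,
# log requests since the checkpoint, replay the log on recovery.

import json

def step(state, request):
    """A toy process step: running total of requests."""
    new_state = state + request
    return new_state, new_state     # (response, next state)

def checkpoint(state):
    return json.dumps(state)        # process state is just plain text

def recover(ckpt, log):
    """Rebuild current state from a checkpoint plus a request log."""
    state = json.loads(ckpt)
    for request in log:
        _, state = step(state, request)
    return state
```

Heuristically, the runtime would refresh the checkpoint and truncate the log whenever replay would otherwise get expensive.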
- Simplified parallelism. Rewriting multiple parts of a program is the default case rather than a special case that requires a lot of effort to make happen. Because even the most lightweight parallelism tends to have some overhead, we might still use explicit annotations like `par(P)` to indicate parallel evaluation. However, we don’t need to make special efforts to ‘touch’ that subprogram with a control-flow cursor. Even when running sequentially, a good evaluator will occasionally yield work on one part of the program to go work on another, both to avoid burning a limited effort quota on an infinite loop and to improve the debugging experience.
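Honoring a `par(P)` annotation might look like this in the toy tuple encoding, using a thread pool as a stand-in for whatever worker machinery a real evaluator would have:

```python
# A sketch of par(P) annotations: a ("par", body) term hints that the
# body is worth evaluating on another worker.

from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)

def evaluate(expr):
    if isinstance(expr, tuple) and expr[0] == "par":
        return evaluate(expr[1])        # same meaning; par is only a hint
    if isinstance(expr, tuple) and expr[0] == "+":
        a_expr, b_expr = expr[1], expr[2]
        if isinstance(a_expr, tuple) and a_expr[0] == "par":
            fut = pool.submit(evaluate, a_expr[1])   # evaluate in parallel
            b = evaluate(b_expr)
            a = fut.result()
        else:
            a, b = evaluate(a_expr), evaluate(b_expr)
        if isinstance(a, int) and isinstance(b, int):
            return a + b
        return ("+", a, b)
    return expr
```

Crucially, `par` never changes the result – dropping every annotation yields the same program – so the annotation is a pure performance hint.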
- Simplified distributed computing. Evaluation of a very large program can be divided among multiple processors in a cloud or mesh network. If a node fails, the worst that happens is we need to recompute some subprograms, and we can limit rework by performing occasional checkpoints. Links to big data or parallel/lazy computations can be communicated cheaply, with the data being serialized only when necessary, which makes it easier to pipeline data through subprograms that are mostly doing data-plumbing work. With annotations like `remote(Node,P)` we can potentially support declarative distribution.
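The fail-and-recompute story can be sketched in miniature. Here "nodes" are simulated by booleans and subprograms are pure, so a failed node's work can simply be redone locally; everything about the node model is an illustrative assumption:

```python
# A sketch of distributing rewriting work: because subprograms are pure,
# a failed node's work is safely recomputed rather than recovered.

def eval_on_node(node_ok, subprogram):
    """Simulate a node evaluating a ("+", a, b) leaf; failure -> None."""
    if not node_ok:
        return None                     # simulated node failure
    _, a, b = subprogram
    return a + b

def distribute(subprograms, nodes):
    """Farm subprograms out to nodes, recomputing on failure."""
    results = []
    for sub, ok in zip(subprograms, nodes):
        r = eval_on_node(ok, sub)
        if r is None:                   # node failed: recompute locally
            r = eval_on_node(True, sub)
        results.append(r)
    return results
```

Checkpointing the overall program between rounds bounds how much of this rework is ever needed.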
A lot of difficult or annoying problems are simplified by fully embracing `Program → Program` rewriting as our primary evaluation model. Nor do I believe performance must suffer – not if we dedicate even a moderate fraction of the effort to program rewriting that we have historically dedicated to imperative programming.
Has the mainstream computing industry hobbled itself by sneering at rewriting? I am starting to think so.
And I am now giving serious consideration to replacing my project’s existing bytecode with a variant more friendly to rewriting.