Wikilon Dictionary Applications

A repeating cycle of simplification and refinement has fine-tuned Wikilon since I introduced it over a year ago. One area where Wikilon has changed significantly is its application model, that is: how I expect to host server-side applications and debugging. The idea now is to use AO dictionary applications for everything, including edit sessions and issue trackers.

Dictionary Applications

An ‘AO dictionary application’ is any application whose state is represented within a healthy AO dictionary. An AO dictionary is a simple association of words to definitions. A ‘healthy’ AO dictionary is acyclic, fully defined, and well typed. AO is a purely functional subset of Awelon Bytecode (ABC) that uses of `{%word}` tokens to link definitions of other words. Linking a word is trivially equivalent to inlining its definition.

Humans usually don’t mess directly with AO bytecode. Instead, they leverage editable views such as Claw, a Forth-like command language, or a more structured variant.

Dictionary applications take this ‘editable views’ idea to a further extreme. Application views may freely involve multiple words and definitions from the dictionary, including evaluation of functions. One very useful pattern for dictionary applications that leverages evaluation is the command pattern. Represented in an AO dictionary, the command pattern looks a lot like:

@foo.v0 (creates initial foo)
@foo.v1 {%foo.v0}(command to update foo)
@foo.v2 {%foo.v1}(another command)
@foo.v3 {%foo.v2}(yet another command)
@foo {%foo.v3}

Here, the word `foo` represents an application object, perhaps a database or a canvas or a collection of communicating processes. With each new command, we add a new definition (e.g. `foo.v4`) then update `foo` to point to the new ‘head’. To obtain the current state of `foo`, we naively would evaluate `foo` from scratch in a provided context. Fortunately, the command pattern is cache-friendly: if we have already computed `foo.v2` in the same context, we can reuse this and evaluate just the last two commands. While command pattern would conventionally be used to present only the ‘current’ state of an object, some applications like REPLs (or a variant with constructive graphics, perhaps inspired from iPython-notebooks) might present each command together with some computed output.

Between hosting state in source, and the command pattern, dictionary applications have many nice features:

  • applications are persistent, portable, versioned like source
  • live coding and continuous testing become implicit
  • abstraction, compression, indexing of state and commands
  • excellent fit with HTTP and REST architectural styles
  • cache-friendly patterns for common updates
  • unlimited undo of commands and dictionary manipulations
  • back-in-time debugging via command or dictionary histories
  • trivial forking, cloning, snapshots of applications

Dictionary applications aren’t limited to explicit ‘refresh’. Trivially, an application could naively poll the dictionary to maintain a current view. But real-time feedback is feasible with Comet technologies like AJAX long-polling or subscription. Real-time views offer a convenient basis for both live coding and communication between agents that share the dictionary.

A dictionary is effectively a filesystem with a rich, computational semantics. Unlike conventional filesystems, a dictionary requires no special consideration for redirects, ad-hoc composition of content, or dictionary compression and procedural generation.

Note: Dictionary applications subsume and simplify a lot of my older ideas, including Wikilon filesystem, abstract virtual machines, extensible syntax via Claw-like views, and embedded literal objects via command patterns.

Real World Effects

Dictionary applications, in their natural environment, are implicitly multi-agent systems.

If our only agent is a human, we can model calculators, REPLs, interactive fictions, and other applications that remain ‘paused’ while waiting on the next human action. If we assume multiple humans, we can suddenly model more interesting applications like forums, shared chatrooms, chess, multi-user dungeons. Communication between humans becomes an implicit effect: humans observe updates caused by other humans. With software agents, or ‘bots’, we can support ad-hoc effects. For example, a human might add a request to a shared chatroom that requires querying an external database. A bot sharing the same chatroom then queries the database and injects the result back into the dictionary to fulfill the request. With software agents, we can model command shells, or time-dependent systems that we iteratively ‘step’ forward until a stable state is reached.

For software agents, effective control becomes important. This is achievable with attributes, i.e. annotations that have no runtime semantics. In the Claw view, I express attributes with parentheses around a list of words, e.g. `(license:BSD3 author:dmbarbour category:math)`. This desugars to a dropped block of annotations `[{&license:BSD3}{&author:dmbarbour}{&category:math}]%`. We can use attributes for all sorts of things: deprecations, todos, indexing, authorship or licensing where it matters, etc.. For bots in particular, we can use opt-in attributes like (allow:myChatBot) or opt-out attributes like `(noindex)`. I generally favor opt-in attributes in cases where a bot will modify the dictionary.

Concurrent Applications

Evaluation within a dictionary is deterministic. Despite this, it is frequently to our advantage to explicitly model concurrent systems, e.g. communicating processes or actors systems. A ‘concurrent system’ is reified as a dictionary object that we can ‘run’ a few steps at a time or interfere with in simple ways (e.g. injecting messages or twiddling state). Our applications are then modeled within these concurrent systems. This pattern buys us several features:

  • ability to iteratively ‘run’ the system a few steps at a time
  • implicit step-based debugging below granularity of ‘commands’
  • ability to interrupt or interfere with applications after any step
  • ability to extend applications by injecting new computations
  • animate, graph, or scrub rendered views of application state
  • easy integration of multiple agents with active application

Debugging concurrent applications hosted in a dictionary is much less unpleasant than doing the same for applications hosted on hardware. For example, there is no risk of heisenbugs. We’re free to step back in time via the command pattern. We can explore multiple valid progressions of non-deterministic concurrent applications (e.g. by using different PRNG seeds to shuffle messages).

The main feature we lose is hardware races for evaluation. We cannot rely on timeouts as a basis for hand-wavy soft real-time behavior. If we need well-timed applications, we’ll need to use explicitly timed programming models (that allow us to express how much time a computation should take, e.g. in terms of seconds or musical beats).

Managed Dictionaries

Preserving a long history for command pattern objects isn’t necessarily a problem. In cases where ‘commands’ are due to discrete and infrequent human actions, such as mailing lists and forums, it is entirely feasible to keep a complete history. Terabytes of storage are quite affordable these days. However, in the general case we might want something like garbage collection for our application histories. Besides GC, I also imagine wanting to automatically refactor, compress, and optimize code in our dictionaries.

As with real-world effects, automatic management of our dictionaries is feasible if we leverage software agents guided by attributes. In particular, I envision tagging words with three attributes to guide automatic management:

  • hidden – avoiding external references; may delete this word if unused
  • opaque – only care about behavior of this definition, not structure
  • frozen – committed to this definition; may inline into opaque definition

Opaque words allow for automatic refactoring, inlining, etc.. Frozen words may be inlined. Hidden words may be GC’d. We can model words with all three attributes, easily enough. Additional attributes might be leveraged to implement ‘gradual’ decay models (e.g. exponential decay). For example, an attribute could represent an expiration condition managed by an independent software agent.

Regarding ‘Big Data’ Applications

It is feasible for application objects to contain multi-gigabyte databases, FTP services, and the like. In conventional programming systems, we’d be forced to shove data out to an external database or filesystem to prevent memory overload. However, Wikilon is designed to address this problem without external dependencies. Instead, we annotate large values for external storage via `{&stow}`. Stowed values are serialized to cold storage but will implicitly be loaded and cached in main memory when necessary. My current implementation uses VCache, which uses a LMDB.

By stowing nodes in a large trie, we can easily model filesystems and databases without loading them into memory all at once. Using finger-trees, we can model larger-than-memory logs and queues. These large values are held by the cache. If we update something far upstream of our cached values we may need to recompute our application states. However, the common case would involve incremental updates and viewing only ‘final’ states.

Parallelism also helps with scalability because it can help quickly recompute a large application state in cases where our updates do not align nicely with our incremental models or existing cache. AO code can easily leverage Haskell’s par/seq parallelism with a few annotations. Long term, it is also feasible to optimize specialized subprograms akin to Haskell’s accelerate package.

Extracting Applications

As Wikilon is a development platform, we’ll sometimes want something closer to the ‘conventional’ edit-compile-test style. For example, we might want to provide a binary for use on an Android phone. This is achieved through an extraction process. The extracted binary is essentially a view on the dictionary, though not a particularly editable one. Conveniently, extraction could be presented to users as a simple GET on an HTTP link, or via a form that constructs an appropriate link (with various compilation flags).

Our extractor must know how to compile a popular ‘type’ of applications for an appropriate back-end. This is easiest for applications modeled as objects with the command pattern, and especially those modeling a concurrent system. Command pattern objects are easy to separate from the dictionary because their ‘state’ doesn’t involve parsing bytecode, and they clearly have ‘updates’. A self-describing object would be even better, i.e. one that can be queried for available commands. Concurrent systems enable us to have some commands take time while permitting interruption or interference, and thus are a good fit for many target systems.

The ability to extract applications in medias res is valuable. This allows us to easily ‘create’ new applications through incremental interactions with existing applications, like working with a spreadsheet or preparing a form so it is mostly filled. So, I think dictionary applications will help a lot even in these cases where we plan to extract a separate binary.

Advertisements
This entry was posted in Language Design, Live Programming, Modularity, State, User Interface. Bookmark the permalink.

2 Responses to Wikilon Dictionary Applications

  1. Pingback: Wikilon Performance Strategy | Awelon Blue

  2. Pingback: Out of the Tarpit | Awelon Blue

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s