Out of the Tarpit

I recently re-read Out of the Tarpit by Ben Moseley and Peter Marks. This paper describes a strategy for eliminating accidental complexity from software. Some of their conclusions:

  1. input is modeled by outside agents updating relvars
  2. output is modeled by outside agents observing relations
  3. logic is expressed by deriving relations and data
  4. most state is cached computations for performance
  5. no need for control flow, but performance hints OK
  6. avoid structured data abstraction and data hiding

IO is agent driven. Human users are included among these agents. But we may also have software agents that, for example, inject a twitter feed into a relvar, or observe a relation to determine feedback to twitter. Similarly we can represent posting to a forum, uploading a binary, moving a joystick, etc.. This is a very REST-ful approach to software architecture. Latency for output can be mitigated via Comet-style long polling or subscription techniques for the observers.

Modulo caching for performance, there is no hidden state. All inputs to the system are represented in non-derived relvars. Those relvars aren’t necessarily monotonic: we could represent spreadsheet-like behaviors by modifying relations in ad-hoc ways. But many use cases involve time-series data, which allows us to represent more conventional streams and logs and is especially cache friendly.

I find dubious the conclusions regarding structured data abstractions. The authors argue that when unneeded, irrelevant data is provided to a function, it becomes difficult to reason about what changes affect the output. I grant this. However the same argument applies equally to logic that, for example, drops columns or filters rows (aka project or select) from relation arguments.

Interestingly, I feel the authors only accidentally struck a critical point regarding the nature of complexity and reasoning, that algebraic composition is critical for reasoning at very large scales. By algebraic composition I mean a system that includes:

  • a set of composable elements, components
  • a uniform set of composition operators
  • algebraic closure when applying operators
  • compositional properties: invariant or inductive

Compositional properties allow us to reason (both intuitively and formally) about how systems will behave as we grow them, without requiring deep knowledge of the internal structure of the components. Functions and relations both exhibit algebraic composition.

Of course there are many more models out there with useful compositional properties: grammars, object capabilities, algebraic data types, CRDTs, free monads, etc.. By restricting structured data abstraction, it becomes difficult to represent what might be a more appropriate abstraction for a particular use case. 

Contrasting with Awelon project and Wikilon, my dictionary applications rely in a very similar manner on external agents for IO and cached computations as a basis for most application state. But I don’t provide any equivalent to the proposed separation between relvars and derived content. One motive for mixing is: transparent procedural generation and semantic compression, the ability to transparently template data or represent macros and elevate the level of user input without tweaking downstream data processing to receive a new type or source of input.

Also, one of my hypotheses is that computation is fundamentally mechanical in nature, that software is a physical medium like shaping clay or stone. Awelon project takes structure and flow of data to be essential complexity, important to make explicit, though we’re free to abstract it via yet another computation. I’m interested in relational programming within a functional system, but I gravitate towards solutions like µKanren as an EDSL for this purpose. Pure `state→state` functions, especially in context of substructural types and tacit actions seem a good fit for modeling human inputs, which are both mechanical and meaningful.

It will be interesting to see whether the ad-hoc purely functional structure and data abstractions of Awelon project and Wikilon fall afoul of the complexity predicted in this paper, or how much we use relational EDSLs in practice.

 

This entry was posted in Language Design. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s