Hierarchical Transaction Machines

Hierarchical transaction machines (HTXMs) are an interesting idea I’ve been developing recently. HTXMs accomplish my goals of live, resilient, declarative, extensible software systems. Further, HTXMs accomplish these goals in a way that’s easy to explain and easy to integrate with existing services and devices.

First, let’s ignore that ‘hierarchical’ adjective. I’ll get back to it!

A transaction machine (TXM) is essentially: a deterministic transaction, repeated indefinitely.

We assume our system has a set of TXMs. Non-determinism arises due to an arbitrary schedule of TXM executions within the set. Process coordination is implicit – if a transaction is unproductive, then deterministic repetition would still be unproductive, so for efficiency we simply wait for a change to an observed variable. Programmers can leverage this, for example by aborting a stream-processing transaction when the input channel is empty. External agents and devices are implicitly modeled as TXMs interacting asynchronously and atomically through shared variables.
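
As a minimal illustration – my own hypothetical sketch, not any particular implementation – a TXM runtime might buffer writes, record the version of every variable read, and commit only if no observed variable changed in the meantime:

```python
# Hypothetical sketch of a TXM runtime. All names are mine. A transaction
# buffers its writes and records the versions of variables it reads; at
# commit time, a version mismatch (a concurrent change) forces a retry.

class Store:
    def __init__(self):
        self.vals = {}      # variable name -> value
        self.vers = {}      # variable name -> version counter

    def read(self, txn, k, default=None):
        txn.reads[k] = self.vers.get(k, 0)
        return self.vals.get(k, default)

    def write(self, txn, k, v):
        txn.writes[k] = v   # buffered until commit

class Txn:
    def __init__(self):
        self.reads = {}     # read set: name -> version observed
        self.writes = {}    # buffered writes

def run_txm(store, body):
    """Attempt one deterministic transaction; commit iff conflict-free."""
    txn = Txn()
    body(store, txn)
    if any(store.vers.get(k, 0) != v for k, v in txn.reads.items()):
        return False        # conflict: discard buffered writes, retry later
    for k, v in txn.writes.items():
        store.vals[k] = v
        store.vers[k] = store.vers.get(k, 0) + 1
    return True

def stream_copy(store, txn):
    # write nothing when the input channel is empty, so repetition is
    # unproductive and a scheduler can wait for a change to "in"
    xs = store.read(txn, "in", [])
    if xs:
        store.write(txn, "in", xs[1:])
        store.write(txn, "out", store.read(txn, "out", []) + [xs[0]])
```

Once `in` is empty, repeating `stream_copy` changes nothing, so a scheduler would simply park the TXM until `in` is next written.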

TXMs have many nice properties:

TXMs support liveness: We can implicitly read our TXM behavior definition at the start of each transaction. This enables us to modify system behavior atomically. This can be leveraged for process control, safe runtime upgrades, and developing application models around live-coding.

TXMs have robust process coordination: There is no need for polling, locks, signals, or interrupts. There is no risk of deadlock or dropped signals. We can wait on arbitrary conditions without specific condition variables. A process scheduler can ensure fairness or prioritize transactions that touch HCI variables.

TXMs are robustly securable: we can define TXMs as operating on an abstract environment provided as a parameter, without static variables. This makes it easy to sandbox the environment, or to simulate it for testing purposes. We can safely work with distrusted TXM definitions by shallowly wrapping them.

TXMs are openly extensible: TXMs are anonymous and operate on a shared environment, like a multi-agent blackboard system. It’s easy to add new features to a system by introducing TXMs that observe or tweak some variables also used by other TXMs. It’s equally easy to remove features.

TXMs are intrinsically idempotent: introducing more than one copy of a TXM won’t affect a system’s logical behavior. This allows us to reason declaratively about whether or not a system has a particular behavior or extension. A set of TXMs is also commutative, but that’s not unusual for concurrency models.

TXMs are fault tolerant and resilient: via hierarchical transactions, we can robustly support strong exception safety with fallbacks for graceful degradation. In a set of TXMs, even if some operate in a failed or degraded mode, the others can operate normally. After the cause for failure is fixed, a TXM will swiftly recover to optimal operating conditions.

TXMs are suitable for real-time programming: it is not difficult to develop a scheduler that guarantees time slots for hard real-time, mission-critical TXMs. We can prioritize transactions responsible for HCI to ensure low-latency reactivity. Compared to threads, TXMs give us very nicely bounded operations for scheduling.

TXMs are easily debuggable: It’s easy to capture and simulate the variable environment observed by a transaction for error reports and regression testing. If transactions return a value on error or commit, we can easily record the history of such values and render it in a task manager or debugger. Debuggers can be supported as normal extensions to the system, allowing ad-hoc views and direct manipulations, rather than hacked in via special compiler modes.

TXMs are simple and familiar: Asynchronous effects through variables are an excellent fit for network access, task queues, direct memory access, publish-subscribe, and other common software design patterns. We can leverage conventional object-oriented interfaces or opaque data types to help protect system invariants. It’s easy to explain and understand TXMs, and to integrate them with real-world systems.

Hierarchical TXMs (HTXMs) are an extension to TXMs to support task-parallel concurrency and multi-stage programming while preserving liveness.

The idea for HTXMs is to form a read-fork tree with read-write loops at the leaves. After a change to an observed variable, we might rebuild a branch of the tree, or even the whole thing, thereby preserving liveness. But insofar as we stabilize the tree (via design patterns, caching and buffering, etc.), computation will mostly occur at the leaf nodes.

To support HTXMs is not difficult: we simply add a `fork` operation to our transactional API, which indicates we should attempt a transaction after we commit. Forked transactions can inherit the read assumptions of the parent. Then read-fork operations can be optimized. (This does awkwardly permit expression of mixed read-write-fork, but we could handle that arbitrarily or treat it as a runtime error.)
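
To illustrate the shape of this API (with hypothetical names, and ignoring the transactional bookkeeping shown earlier), a forked transaction is simply attempted after its parent commits, and forking recursively yields the read-fork tree:

```python
# Hypothetical sketch of `fork`: a transaction body receives a fork
# operation and may register child transactions to attempt after it
# commits. Recursing on the children yields a read-fork tree with
# ordinary read-write transactions at the leaves.

def run_tree(body, log):
    children = []
    body(children.append, log)   # the body may call fork(child) freely
    # ...validate reads and commit here; children would inherit the
    # parent's read assumptions so read-fork nodes can be optimized...
    for child in children:
        run_tree(child, log)

def leaf(name):
    return lambda fork, log: log.append(name)

def root(fork, log):
    fork(leaf("left"))           # a pure read-fork node: reads and forks only
    fork(leaf("right"))
```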

Conveniently, a single HTXM can model an entire set of TXMs. We can also hierarchically embed HTXMs.

The many advantages of HTXMs come at a cost: rework. When a transaction aborts after some effort, we have wasted at least part of that effort. The premise for TXMs as a practical model is that the amount of rework can be controlled to an acceptable level. Design patterns can help a lot to focus a system on productive work. We can also develop heuristic schedulers that are sensitive to read-write conflict histories to reduce rework due to concurrency conflicts.

Caveat: TXMs are unsuitable for distributed processing. I assume network access is modeled as an effect. If users wish to model PAXOS or CRDTs, they should do so explicitly. (As an exception to prove the rule, TXMs over RDMA might be acceptable.)

Posted in Language Design

RDP Simplified

Reactive Demand Programming (RDP) is a reactive programming model I developed circa 2010-2012. The essential idea behind RDP is that “effects” should be commutative and continuous.

What I mean by “continuous” is that behaviors must be expressed statefully, attached to stateful representations. Continuity makes it easy to observe and directly manipulate system behavior, which is valuable from a human-in-the-loop perspective. Meanwhile, commutativity means that the order of expression for effects must be irrelevant. This property simplifies reasoning about concurrent behaviors, both within a program and for multi-agent systems.

To achieve commutative and continuous effects, RDP uses a concept of “demands”. A demand is a continuous request from a program (or subprogram), which may be observed and fulfilled by other programs (or subprograms). Demands are only observable collectively, as a set, not individually.

Unfortunately, my original expression of RDP was complicated by three factors:

  • Expression of RDP programs was arrowized, which is relatively painful to work with when compared to monadic programming.
  • I wasted efforts to support “instantaneous” reactive loops, i.e. such that we can observe effects without logical delay.
  • I required idempotence, such that duplicate expression of effects do not contribute to system behaviors.

Monadic expression of RDP programs is not difficult, especially if we remove instantaneous loops. A simplified monadic RDP API:

typeclass (Monoid a) ⇒ Bag a // requires commutative mappend
typeclass (Monad m) ⇒ RDP m where
    type Var m : * → *
    type Dir m : *
    dir : Dir m → String → m (Dir m)
    var : (Bag a, Typeable a) ⇒ Dir m → String → m (Var m a)
    get : Var m a → m a
    put : Var m a → a → m ()

Unlike conventional monadic effect loops, an RDP monadic behavior is applied continuously, repeating at each time-step. We have a notion of persistent, labeled variables that are available from one time-step to the next. When we get a variable, we’ll receive the value from the previous time-step. When we put, we contribute our value to the next time-step. All contributions within a time-step are unioned together, via commutative mappend.
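
To make the time-step semantics concrete, here is a speculative sketch in Python (the names are mine, not part of any RDP implementation), using multisets as the commutative monoid:

```python
# Each time-step, every behavior observes the previous step's variables
# and contributes demands; all contributions are merged via a commutative
# mappend (here, multiset union via Counter addition).
from collections import Counter

def step(prev, behaviors):
    """prev: label -> Counter (a bag of demands). Each behavior maps the
    previous state to {label: contributions}; contributions are unioned."""
    nxt = {}
    for b in behaviors:
        for label, bag in b(prev).items():
            nxt[label] = nxt.get(label, Counter()) + Counter(bag)
    return nxt
```

Because Counter addition is commutative, the order of behaviors is irrelevant, and duplicate contributions add together rather than collapse – the relaxed idempotence of bags versus sets.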

By relaxing idempotence – that is, by using bags of demands instead of sets – we can add entries together. This is convenient for modeling votes, priorities, and weighted preferences. We can similarly model negation or suppression, which is convenient for modeling rules or constraints (see Behavioral Programming for examples of how this might be useful).

Although we do not allocate ‘new’ variables, a notion of local state is still valuable for RDP systems. For the API above, I borrow the concept of filesystem-like directories. Local state is achieved by assigning a distinct subdirectory to each distinct module or subprogram, such that subprograms cannot accidentally (or maliciously) interfere with each other. We can feasibly support object-capability patterns involving sharing of vars. Of course, a major goal of RDP is to preserve access to stable views of state for monitoring, extension, debugging, etc.. Thus, local state is never hidden from the parent process.

Effects in an RDP system are modeled simply in terms of shared variables between a subprogram and its environment – which at the top level should be a runtime or operating system. A subprogram can output demands – e.g. to display a window, to move a robot – into a variable that the environment observes. The environment can provide inputs into a variable that our subprogram observes. (The specific variables involved can easily be hidden behind a record of monadic operations/capabilities.) Of course, providing all possible inputs at all times would be inefficient. But we can leverage flexible patterns that involve registering available services or resources, or use of outputs to demand specific inputs.

In this sense, RDP has some similarities with tuple spaces or blackboard systems, albeit with RDP variables being more fine-grained than blackboards. There is also a similarity with functional relational programming, as described in Out of the Tar Pit by Ben Moseley.

Implementing RDP efficiently remains a challenge. In general, we could try to optimize evaluation of an RDP subprogram by memoizing its partial evaluation and only re-evaluating sub-computations whose observable results have changed. This might require an explicit memoization combinator, with a comparable result type. Extending the monad with explicit fork-join concurrency (e.g. fork : m a → m b → m (a * b) and fork_ : m () → m b → m b) may also help in isolating evaluation of subprograms. However, an efficient implementation of a monadic RDP API should at least be simpler than efficient implementation of arrowized RDP, by virtue of monads being simpler than arrows.

Posted in Language Design

Monadic RDP

Reactive Demand Programming (RDP) is a programming model I developed circa 2010-2012. I’ve tabled it for a few years to work on other project goals, and since then I’ve also become interested in Kahn Process Networks (KPNs) as a simpler alternative in some ways. But RDP still is an interesting paradigm, with features convenient for live programming and open, distributed systems.

The essential idea for RDP is simply this: concurrent effects should be commutative and idempotent.

That is, expressing a specific request twice is the same as expressing it once or three times. If we have two different requests at the same logical time, the order in which we evaluate or observe them should not matter. With these properties, we preserve a lot of the nice equational reasoning and refactoring features associated with purely functional programming, and a lot of declarative reasoning is feasible. Essentially, effects are sets of declarations for input or output.

There are also some secondary ideas developed for RDP, such as precise logical timing. You can search this blog for more on the subject.

Anyhow, the original RDP models are arrowized, but monadic expression of behaviors would likely be more convenient.

Instead of normal imperative variables, which have one value at a time, our monadic RDP system will only have set variables, each with a label. Writing a variable simply adds a given value to the set. Reading the variable will return the prior value at the label, which will be a set.

The sets observed upon reading a variable must have deterministic ordering and duplicates removed, so we only support comparable value types. Actually, rather than sets we could use any commutative, idempotent monoid type. But sets are the most obvious such type.

Like RDP arrowized behaviors, our monad value would be applied continuously, corresponding to a stateless behavior applied in just one time-step. Like arrowized RDP, we could also support a “delay” operation that enables RDP behaviors to be distributed over multiple time-steps, but we might need to constrain the number of delay steps (otherwise we end up with untethered, fire-and-forget RDP behaviors – which is interesting, but undermines any live programming features). Effects would be modeled in terms of exposing a subset of variables for external input or output. We might also add a few auxiliary features, such as single-assignment (per time step) labels, value suppression, forking of behavior threads, hierarchical structure for labels, and so on.

Posted in Concurrency, Distributed Programming, Language Design, Live Programming, Open Systems Programming, State

Awelon Dictionary

A goal of Awelon project is to make software artifacts easy to share, modify, interact with, and compose. Sharing is easiest for documents, images, and similar artifacts that have no external dependencies. For Awelon, I apply this concept to the codebase.

An Awelon codebase is called a “dictionary”. It’s a big set of definitions that are versioned, curated, and maintained together. A dictionary does not depend on external definitions beyond the few Awelon primitives. This opposes the conventional “library” model where we use ad-hoc dependency and configuration managers to glue together interdependent subsets of definitions.

An Awelon dictionary is not limited to defining functions. It can easily contain definitions for data – maps, music, almanacs, articles, paintings, presentations, fonts, forums, books, or blogs. Binary objects may be embedded indirectly via %secureHash references.  Effectively, a dictionary serves as a “smart” filesystem with potential support for typechecked data, partial evaluations, transparent procedural generation, and spreadsheet-like views. In some cases, soft real-time updates are possible, e.g. adding telemetry data for moving objects.

(Aside: We could follow Haskell’s example of modeling a process with a `main` function and monadic effects detached from dictionary state. However, Awelon project is exploring alternative application models to make computation more accessible and visible. Software agents recording application state in a codebase seems a promising basis.)

To share entire codebases, we may logically embed one dictionary within another, such that dict/foo represents access to foo under the named dictionary. This structure avoids naming conflicts and securely controls dataflows – i.e. data can only flow outwards from `dict/` to its host. However, because dictionaries have no external dependencies, they’ll tend to replicate common function definitions (such as arithmetic and list processing).

Ultimately, Awelon dictionaries are massive, widely shared, largely similar, and frequently updated. In context of use, I desire incremental compilation that is optimally shared with equivalent definitions in similar dictionaries.

It took me a long while to develop a suitable representation.

I eventually found inspiration from log-structured merge-trees, radix trees, and use of secureHash for acyclic reference between binaries. Proposed representation:

/prefix1 secureHash1
/prefix2 secureHash2
:symbol1 definition1
:symbol2 definition2

Each dictionary node is represented as a series of updates. We can either update a symbol (defining or deleting it, `:` or `~` respectively) or an entire prefix `/`. In the latter case, we use a secure hash to identify another dictionary node, and we strip the matched prefix from symbols within that node.

To find the definition for a symbol, only the last update with matching symbol or prefix is examined. Hence, this representation can be used as an append-only log for streaming updates. For example, an update to `/p` will mask prior updates to `:poke` and `/prod`. We assume the new node at `/p` will contain those definitions if we intend to preserve them. The empty prefix (as in `/ secureHash`) is valid and useful. It masks all prior updates and essentially represents a checkpoint or reset in context of a stream, or a prototype inheritance as the first update in a node.
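
A lookup over this representation is a short recursive scan. The sketch below is my own illustration (node fetching by secure hash is mocked with an in-memory dict of node IDs):

```python
def lookup(nodes, node_id, symbol):
    """Find a symbol's definition: scan a node's updates newest-first,
    where ':' defines, '~' deletes, and '/' defers to another node with
    the matched prefix stripped from the symbol."""
    for op, key, arg in reversed(nodes[node_id]):
        if op == ':' and key == symbol:
            return arg
        if op == '~' and key == symbol:
            return None              # deleted
        if op == '/' and symbol.startswith(key):
            return lookup(nodes, arg, symbol[len(key):])
    return None
```

For example, with a root node `[(':', 'poke', 'old'), ('/', 'p', hashP)]`, the later `/p` update masks the earlier `:poke` definition, exactly as described above.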

Hierarchical dictionaries may be naively included. Using `/dict/ secureHash` we can logically copy an entire dictionary. The radix tree structure that strips matched prefixes is essential for structure sharing. With `:dict/foo def` we can deeply update an individual definition.

Because recent updates are buffered near the root node, we effectively have lightweight working sets with near O(1) update. Because inner nodes are identified by secure hash, we obtain a lot of structure sharing and reasonably efficient diff computations to help maintain a cache of evaluated or compiled forms.

It seems this very simple dictionary representation can support most of my goals. I can even feasibly support distributed storage and lazy download of secure hash references, so clients can work with dictionaries much larger than a single disk.

Posted in Language Design

Awelon Labeled Data

Awelon is a minimalist language, with four confluent concatenative combinators:

[B][A]a == A[B]     (apply)  
[B][A]b == [[B]A]   (bind)
   [A]c == [A][A]   (copy)
   [A]d ==          (drop)

The motivation for minimalism is simplicity. Awelon project’s goals include that every feature be simple, comprehensible, trustable – from language primitives up to multi-agent software systems (via shared monotonic dictionaries, etc.). So I construct programs from these few primitives. Of course, there are also some disadvantages for performance and syntax. For Awelon, these disadvantages are ameliorated by use of acceleration and projectional editing.

Originally, my intention had been to leverage acceleration and projectional editing even for labeled data – records and variants – in Awelon.

The main benefits of labeled data are extensibility, commutative structure for refactoring, and human-meaningful documentation. Labels can be modeled as paths through a trie, where the path encodes the label. Unfortunately, I’ve discovered that benefits of labeled data seem to be weakened when we choose a concrete encoding of labels and records. It becomes difficult to reason about which observations can be made on a label or record, or how accessors and records might be constructed dynamically. It’s even difficult to preserve meaningful labels when reporting type errors.

I’ve chosen to provisionally add label-primitives to Awelon that work as follows:

[[A] :a [B] :b [C] :c] .b == [B] [[A] :a [C] :c]
[A] :a [B] :b == [B] :b [A] :a  (labels commute)

Logically, these operate linearly on row-polymorphic abstract record constructors:

:a   :   {R without a} [A] :a → {R, a:[A]}
.a   :   [{} → {R, a:[A]}] → [A] [{} → {R}]

The actual record value is abstract and ephemeral, never represented within Awelon code or evaluations. Clients can treat `[[A] :a [B] :b]` as a record. But it’s simultaneously a normal function subject to ad-hoc composition, abstraction, and factoring. Awelon’s abstract record model forbids shadowing of labels and has linear access by default. Records don’t provide any features we couldn’t equally achieve using tuples `[[A][B]]` with extrinsically labeled positions.
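
As a rough model of these primitives (my own simulation, not Awelon code), a record can be treated as an association list with no shadowing and linear access:

```python
def put(record, label, value):
    """Model of `:label` -- extend the record; shadowing is forbidden."""
    assert all(l != label for l, _ in record), "no shadowing of labels"
    return record + [(label, value)]

def get(record, label):
    """Model of `.label` -- extract the value linearly, returning the
    extracted value together with the remainder of the record."""
    rest = [(l, v) for l, v in record if l != label]
    return dict(record)[label], rest
```

This mirrors `[[A] :a [B] :b [C] :c] .b == [B] [[A] :a [C] :c]`, with label commutativity visible as equality of the underlying maps regardless of insertion order.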

For labeled variants, consider that basic sum type `(A+B)` has a Church encoding `∀r.(A→r)→(B→r)→r`. That is, an observer will supply a “handler” for each case, and the value itself selects and applies one handler, dropping the others. For labeled sums, we simply need a labeled product of handlers – `[[OnA]:a [OnB]:b [OnC]:c …]`. Variant values could be given concrete representation such as `[.label [Value] case]`. We need projectional editing to make this pretty, but the primitive syntax is reasonably compact, self-documenting, and close to the intended meaning.

I’m somewhat reluctant to add this primitive concept to the Awelon language. I feel labeled data is sufficiently valuable for extensibility, commutativity, and human-meaningful documentation to warrant some special treatment. But there are also risks – labels tend to bloat data representations and introduce pressure to name things, while many data structures such as lists or trees need only a few cases, so positions work well enough. This feature currently has a “provisional” status in case I find a better approach in the future.

Posted in Language Design

Lightweight Staged Programming for Awelon

Multi-stage programming is essentially disciplined partial evaluation, a widely useful optimization technique especially in context of DSLs. By ‘disciplined’ I mean programmers can comprehend and control when computations occur, and statically debug staging errors. This is valuable for avoiding hidden performance regressions.

Yesterday, I was inspired and found a simple way to support multi-stage programming in Awelon by introducing two annotations:

[F](now) == [F]     iff [F] doesn't contain (now)
[F](later)          masks F from analysis by (now)

For Awelon, partial evaluation is simplified by local rewrite semantics and the lack of entanglement with variable environments. Hence annotations do not need to drive partial evaluation. They only offer means to declare requirements and debug errors – provide some discipline. I can separately develop models (DSLs, monads, etc.) for convenient multi-stage programming.

The (now) annotation is essentially an assertion that peeks deep into otherwise opaque representations. The (later) annotation enables abstraction of staging requirements and working with more than two stages. Global function definitions should be implicitly staged, such that unmasked (now) indicates static evaluation in that context, and we can safely assume named functions are statically stage safe after compilation.

These annotations can be checked dynamically, but it’s feasible to statically compute whether runtime staging will be correct, assuming we know the arities of all the things. Our inferred types would need to record for each input whether any (now) annotation will be exposed due to partial evaluation if just that many inputs are provided.

Aside: Use of (now) and (later) is inspired from temporal logic. Temporal logic is a natural fit for multi-stage programming, but I think most languages wouldn’t support such a lightweight adaptation.

Posted in Language Design

Labeled Data: Records and Variants

Records and variants are basic labeled data types. A record might be expressed as `{x:12.1, y:13.7}`, having a value for each label. A variant might be `Damage:11.3` or `Say:"Hello"`, and involves selection of one label and an associated value. We can generally combine these to have a variant of records or a record of variants. Labeled data is convenient insofar as it is self-documenting (human meaningful) and extensible. Most modern programming languages support labeled data types as a built-in feature. A few have explored fluent extensibility under the names “row polymorphism” or “polymorphic variants”.

But assume we have a much simpler language. This language readily supports simple binary products (a*b) and sums (a+b). A binary product is a simple pair of values like `{fst:12.1, snd:13.7}` while a binary sum is a simple choice of two like `Left:11.3` or `Right:"Hello"`. Effectively, we have spatial labels that are neither human meaningful nor extensible. For my Awelon project, this isn’t a hypothetical scenario. Awelon can readily Church encode binary products and sums, and can even provide a conventional syntactic projection, but has no built-in type-level support for labeled data.

Given simple binary products and sums, can we recover the benefits of built-in labeled data types?

I think we can do that and better. But first a short detour. Type `0` (aka void) has no values (perhaps excepting `error`). Type 0 is the identity type for sums, in the sense that type `(0+a)` or `(a+0)` is equivalent in many ways to just `a`. Similarly, type `1` (aka unit) is an identity type for products, and `(a*1)` or `(1*a)` is equivalent in many ways to just `a`. But there are some important *structural* differences. Any non-error value from `(0+a)` will be encoded as `Right:_`. With deep sum type structures such as `(0+(0+((0+(a+0))+0)))` we end up encoding values as `Right:Right:Left:Right:Left:_`.

One idea, then, is to encode our label within this Left-Right-Left sequence. One viable encoding is `(RL|RR)*LL` where `RL` means `Right:Left:` and encodes a zero-bit, `RR` encodes a one-bit, and `LL` encodes sequence stop. (This particular encoding has a convenient property of being simple and self synchronizing.) Given our string of ones and zeroes, we may subsequently encode labels using ASCII or UTF-8. This gives us a basis for variants. Similarly, records can be represented by encoding a trie on the label we use for variants. For the earlier example, directly encoding the path would give us a corresponding structure `(1*(1*((1*(a*1))*1)))`. (We might explore alternative encodings to make it easier to extend or enumerate a record.)

Those 0 and 1 types essentially represent the ‘extensibility’ options for our variants and records. We get more choices for variants, more slots for records, while the label encoding itself ensures unbounded extensibility. Conveniently, we can also use a variant label to access a record, which provides a simple basis for ‘case’ or ‘switch’ expressions. We do additionally need syntactic sugar for working with these otherwise onerous structures, and likely a compiler that accelerates them. But there is little need for special type level support. And by modeling these records and variants, we can more readily get various polymorphic features, and the ability to work with labels themselves as first-class, typeable structures.

Anyhow, while I found this idea interesting, and I haven’t seen other people approaching labeled data this way, I wouldn’t be surprised if it’s mentioned in papers older than I am. I’ll be pursuing this further in Awelon project.

Posted in Language Design, Types

Reactive Process Networks

Kahn Process Networks are a very nice sweet spot for functional purity and concurrency, and are widely applicable. I’ve recently considered them as a viable alternative to monadic effects models. However, KPNs are missing an important feature: the ability to merge asynchronous inputs from different channels. There is no consistent sense of time for messages within a KPN.

Fortunately, it is easy to add time to KPNs, and thus have reactive process networks:

  • Every process and message is stamped with an implicit time value.
  • Channels have logical latency, adding that latency to time stamps.
  • READ may return Nothing if no messages are available at the process time.
  • WRITE will stamp the outgoing message with the process’s current time.
  • In addition to READ and WRITE, a process may WAIT on a set of channels.
  • WAIT will advance the process time to the earliest message in the set.
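
These rules can be sketched over plain message queues (a hypothetical model of my own, with each channel as a time-sorted list of `(timestamp, payload)` pairs):

```python
def read(chan, proc_time):
    """READ: return the next message available at or before the process
    time, or None when the channel has nothing available at that time."""
    if chan and chan[0][0] <= proc_time:
        return chan.pop(0)[1]
    return None

def wait(chans, proc_time):
    """WAIT: advance the process time to the earliest pending message
    across the given channels (unchanged when nothing is pending)."""
    pending = [c[0][0] for c in chans if c]
    return max(proc_time, min(pending)) if pending else proc_time
```

A merge process would WAIT on both of its input channels, then READ whichever one has a message at the advanced time – the asynchronous merge that plain KPNs cannot express deterministically.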

For efficient distributed implementation, we’d implicitly use synchronization messages to report the advance of logical time for a waiting process. Intriguingly, it is feasible to perform some optimistic concurrency: determine likely future outputs on an assumption that no last-second messages on open input ports will change things. And of course, we’ve also solved the expressive weakness of KPNs. We can merge asynchronous messages into a stream, or represent clock-like behaviors – loops with latency.

Another interesting point is that basic KPNs don’t support bounded-buffer ‘push back’ because it can result in deadlock. We can implement them that way and accept the risk, but it isn’t part of Kahn’s original model. But with reactive KPNs, we don’t as much need bounded buffers because we can bound computations in ‘time’, e.g. by evaluating a KPN up to a given time or controlling the advance of time for open input channels.

The time stamps are logical, and are never directly observed by a process. Time stamps and latencies may use simple natural numbers. We can ‘normalize’ time stamps within a network, e.g. by subtracting the smallest time stamp in the network from all time stamps in the network. For integration with real-time systems, our logical time stamps may have a simple mapping to wall-clock times, such as logical microseconds. We can also use logical latencies to guide physical distribution when evaluating large process networks.

The original KPN behavior is supported. We can always WAIT before READ. Or we can simply use zero-latency wiring and never advance time on open input channels. While READ does not logically WAIT, it must still implicitly wait until we determine whether or not more messages will arrive at a process’s current time.

Reactive process networks resolve my few remaining concerns with using KPNs as both a primary effects model and as a foundation for first-class high-performance distributed programming within purely functional computations. Reactive KPNs could also serve as an excellent substrate for my reactive demand programming model.

Posted in Concurrency, Distributed Programming, Language Design, Reactive Demand Programming

Lambdas in Tacit Code

While tacit programming is an excellent default, it is not suitable for all use cases. Attempting to use tacit programming everywhere results in pointlessly onerous data shuffling. A few local names, in the form of lambdas and let expressions and lexical closures, can simplify a useful subset of code, making it easier to read and write. Many tacit languages, including Factor and Kitten, provide an ‘escape’ from the tacit style. Factor uses a macro rewrite, while Kitten treats this as a primitive feature with support from the syntax and runtime.

I am developing a tacit concatenative language, Awelon. Unlike conventional programming languages, Awelon takes an approach of storing code with a simple, uniform syntax in a database, and minimizing the core language. Awelon has become considerably simpler since last time I discussed it on this blog. If you’re interested, I encourage you to peruse the linked documentation. The core of Awelon is just four primitive combinators:

Awelon Primitive Combinators
[B][A]a == A[B]           (apply)
[B][A]b == [[B]A]         (bind)
   [A]c == [A][A]         (copy)
   [A]d ==                (drop) 

Awelon additionally has syntactic sugar for natural numbers and text literals, and parenthetical annotations like `(nat)`, `(memo)`, or `(par)` for performance and debugging. But all observable behavior and data is ultimately defined in terms of the four combinators. All data is logically Church or Scott encoded as functions, even if accelerated under the hood. Because Awelon’s syntax and semantics are very simple at the storage layer, I leverage editable views for presenting a more sophisticated view to the programmer. For example, a proposed view might expand decimal number `3.141` to `[[3141 0 integer] 3 decimal]`. And `-4/6` to `[[0 4 integer] 6 rational]`.

For Awelon, an escape from the tacit style should also be an editable view – a bit like Factor’s macro expansion, but reversible because I’ll be storing the tacit representation. How to accomplish this, exactly, didn’t really click for me until yesterday when I was reviewing the Wikipedia article on combinatory logic and specifically the translation from lambda calculus to SKI combinators. I can apply a similar translation to extract lambda variables from Awelon code:

Extract to Argument
   T(X,E) extracts X from E
   ∀X,E. [X] T(X,E) == E
   T(X,E) does not contain X

T(X, E) | E does not contain X => d E
T(X, X) => i  (where [A]i == A)
T(X, [X]) =>  (empty)
T(X, [E]) => [T(X,E)] b
T(X, F G)
  | only F contains X => T(X,F) G
  | only G contains X => [F] a T(X,G)
  | otherwise  => c [T(X,F)] a T(X,G)

We can implement `i = [][]b a a d`.

This is a simple algorithm that generates efficient code. The translation is ‘reversible’ by partial evaluation of `[X] T(X,E)` to propagate the variable back into the rendered view. The rules can be simplified a bit (no need for `i`) if we assume X is a value and instead extract to `X T(X,E)`.
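To make the translation concrete, here is a Python sketch (my own encoding, not Awelon’s implementation): a program is a list whose items are words (strings) or blocks (nested lists). `extract` implements `T(X,E)` literally, so its output may differ cosmetically from the hand-optimized examples later in the post. A minimal rewrite evaluator for `a b c d i` lets us check the law `[X] T(X,E) == E` on examples.

```python
def contains(x, e):
    """Does code sequence e mention word x at any depth?"""
    return any(i == x or (isinstance(i, list) and contains(x, i)) for i in e)

def extract(x, e):
    """T(x, e): code T with no occurrence of x such that [x] T == e."""
    if not contains(x, e):
        return ['d'] + e                          # E lacks X    => d E
    head, rest = e[0], e[1:]
    if not rest:                                  # single item
        if head == x:
            return ['i']                          # T(X, X)      => i
        if head == [x]:
            return []                             # T(X, [X])    => (empty)
        return [extract(x, head), 'b']            # T(X, [E])    => [T(X,E)] b
    in_head, in_rest = contains(x, [head]), contains(x, rest)
    if in_head and not in_rest:                   # only F has X => T(X,F) G
        return extract(x, [head]) + rest
    if in_rest and not in_head:                   # only G has X => [F] a T(X,G)
        return [[head], 'a'] + extract(x, rest)
    return ['c', extract(x, [head]), 'a'] + extract(x, rest)

def run(code):
    """Minimal rewrite evaluator for a, b, c, d, i over block operands;
    unknown words pass through to the output program. It does not reduce
    inside blocks, which suffices for these checks."""
    out, code = [], list(code)
    while code:
        item = code.pop(0)
        if isinstance(item, list):
            out.append(item)
        elif item in ('a', 'b') and len(out) >= 2 \
                and isinstance(out[-1], list) and isinstance(out[-2], list):
            A, B = out.pop(), out.pop()
            if item == 'a':
                code = list(A) + [B] + code       # [B][A]a == A[B]
            else:
                out.append([B] + list(A))         # [B][A]b == [[B]A]
        elif item in ('c', 'd', 'i') and out and isinstance(out[-1], list):
            if item == 'c':
                out.append(out[-1])               # [A]c == [A][A]
            elif item == 'd':
                out.pop()                         # [A]d ==
            else:
                code = list(out.pop()) + code     # [A]i == A
        else:
            out.append(item)
    return out
```

For example, `extract('X', ['X', 'foo', ['Y'], 'bar'])` yields `['i', 'foo', ['Y'], 'bar']`, i.e. `i foo [Y] bar`, and evaluating `[X]` against the extracted code recovers the original program.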

I find Factor’s syntax for named locals to be headache inducing. But Kitten’s `→ X;` assignment syntax is reasonably aesthetic and effectively supports both lambdas and let expressions. Consider an adaptation of Kitten’s syntax for named locals to Awelon:

View: → X; X
Source: (λ X) i

View: → X Y Z; X foo [Y] bar
Source: (λ X Y Z) d [i foo] a bar

View: 7 → X; X X * X -
Source: 7 (λ X) c a c a [*] a i -

Here the `(λ X)` is an annotation or comment that tells our editable view that we intend to present the code as a lambda expression with a specific variable name ‘X’. The details of its representation aren’t essential. What does matter is that we can use the comment to systematically recover the view from source:

Source: (λ X) i
View: → X; [X] i  
  => → X; X

Source: (λ X Y Z) d [i foo] a bar
View: → X Y Z; [X][Y][Z] d [i foo] a bar
  => → X Y Z; [X][Y][i foo] a bar      (dropped Z)
  => → X Y Z; [X]i foo [Y] bar         (apply a)
  => → X Y Z; X foo [Y] bar            (apply i)

Source: 7 (λ X) c a c a [*] a i -
View: 7 → X; [X] c a c a [*] a i -
  => 7 → X; [X] [X] a c a [*] a i -
  => 7 → X; X [X] c a [*] a i -
  => 7 → X; X X [X] [*] a i -
  => 7 → X; X X * [X] i -
  => 7 → X; X X * X -

It seems to work, though we may need to limit partial evaluations based on the patterns in ‘extract to argument’ to avoid evaluating parts of code irrelevant to the lambda.

Awelon programmers will have named locals where convenient without entangling names with syntax or semantics. This has me feeling more confident in my decision to favor concatenative combinators as a simpler base for computation than lambda calculus.

Addendum: To avoid unnecessary copying, we can introduce specialized rewrite rules for conditional behaviors of general form `T(X, [onT][onF]if) => [T(X, onT)][T(X, onF)]if`. I assume `if` will drop one branch and inline the other based on a boolean on the stack. Thus, I leave `X` on the stack rather than copy it into both branches.

Posted in Language Design, User Interface | Leave a comment

KPNs as an Effects Model

In this article, I motivate a variant of Kahn Process Networks (KPNs) with bounded channels as a better – more efficient, scalable, composable, securable, serializable, comprehensible, and debuggable – approach to modeling effectful application behavior, especially in the context of purely functional languages and conventions today.


In context of purely functional programming, Haskell pioneered a simple and effective approach to effects by simulation of imperative programming: model a data structure that returns either a final value, or yields with a (request, continuation) pair. By making this data structure opaque, inaccessible to user code, a compiler aware of the structure can optimize and inline requests heavily to achieve acceptable performance. In Haskell, this opaque data structure is called `IO`. (This trivial pattern is also called by obscure and scary names from category theory, but that hasn’t done Haskell any favors.)
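As a toy rendering of this pattern, a Python generator can stand in for the opaque structure (all names here are mine): the program yields one `(request, …)` pair at a time and is resumed with the response.

```python
def program():
    """A toy effectful program: one request at a time, in sequence."""
    x = yield ('read', 'config')      # request; resumed with the response
    yield ('write', 'log', x)         # a second request using the result

def interpret(prog, env):
    """Drive a program by servicing its requests one by one."""
    gen = prog()
    resp, log = None, []
    try:
        while True:
            req = gen.send(resp)      # deliver response, receive next request
            if req[0] == 'read':
                resp = env[req[1]]
            else:                     # 'write': record the payload
                log.append(req[2])
                resp = None
    except StopIteration:
        return log
```

Running `interpret(program, {'config': 42})` yields `[42]`: the interpreter sees exactly one pending request at any moment, which is precisely the bottleneck discussed next.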

One fundamental weakness of this design, however, is the singular nature of that `(request, continuation)` pair. Implicitly, this becomes a performance bottleneck: we’re performing only one ‘request’ at a time. Fortunately, we can work around this limitation in a few ways:

  • Asynchronous requests. Rather than awaiting a result immediately, we may continue with a ‘future’ reference for the result that can be used in a future request. The request itself may then be processed in parallel with ongoing computation.
  • Multi-threading. We can support a request to fork a thread that itself can make more requests. Further, we can support requests for communication between threads, e.g. via channels or shared mutable variables.

And this is essentially what Haskell does. We have our `forkIO` request, and a panoply of communication methods (IORefs, MVars, TVars, channels modeled with them, etc..). Asynchronous futures are frequently implicit to Haskell’s lazy evaluation model.

Unfortunately, these workarounds come at significant cost. Those futures and variables deeply entangle the application with its environment, which hinders serialization, persistence, and debugging. Multi-threading introduces non-deterministic behavior, which can hinder robust testing or reproduction of error conditions.

A second major weakness of this design regards the opacity of that `(request, continuation)` pair.

Assume two applications with opaque effects – `foo.main()` and `bar.main()`. We could compose these sequentially, or in parallel, or invoke one from within the other. However, we cannot control the interleave of requests. We cannot intercept requests in a mockup environment for testing, security, or precise embedding. It is not possible to construct wrappers around an application to tune requests and responses for a different environment. It is not possible to cleanly halt an application. Ultimately, this hurts composition, testing, security, portability, and process control.

Again, there are workarounds – e.g. sandboxes, ad-hoc dependency injection, object capability security. But again, there is a cost – this time to those performance benefits we originally received from compiling with opaque effects.

Kahn Process Networks

KPNs are a simple model consisting of processes and channels. A process is limited to two actions: read from a channel, or write to a channel. A channel carries a FIFO queue of messages, and may have at most one reader and at most one writer. Reading from a channel will block if there is no message available.

Generally, writing to a KPN channel does not block. But I believe a variant of KPNs where each channel is annotated with a maximum capacity is superior for every practical use case. Bounded channels can be trivially implemented above normal KPNs by adding a reverse channel for acknowledgements to every message channel, then simply waiting for an ack after sending.
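That construction can be sketched in Python (class and method names are mine), with unbounded FIFOs standing in for KPN channels. Waiting for an ack after each send gives capacity one; pre-loading N acks generalizes this to capacity N:

```python
import queue

class BoundedChannel:
    """A bounded channel built from two unbounded KPN-style channels:
    a data channel plus a reverse acknowledgement channel. The writer
    consumes one ack per send, and the reader returns one ack per
    receive, so at most `capacity` messages are ever in flight."""
    def __init__(self, capacity):
        self.data = queue.SimpleQueue()   # forward message channel
        self.acks = queue.SimpleQueue()   # reverse ack channel
        for _ in range(capacity):         # pre-load capacity acks
            self.acks.put(None)

    def send(self, msg):
        self.acks.get()                   # blocks once capacity is reached
        self.data.put(msg)

    def recv(self):
        msg = self.data.get()
        self.acks.put(None)               # free one slot for the writer
        return msg
```

With `BoundedChannel(2)`, a writer may send two messages ahead of the reader; a third `send` blocks until the reader drains one.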

Construction and composition of KPNs is readily achieved in the style of Flow-Based Programming (FBP). Instead of directly naming a channel, a process will read or write a port. Separately, a configuration wires output ports from one process to input ports on another.

Aside: KPNs and FBP are similar. Some important differences: FBP has bounded channels by default, ad-hoc side-effects, and non-deterministic ‘merge’ for multiple writers into a shared wire.

KPNs can be used as an effects model.

Open (aka dangling) output channels can be filled with requests to be read by the KPN’s interpreter. Responses can be written to open input channels. A lightweight, declarative configuration can tell our interpreter from which channels it should read requests, and in each case to which channel it should write responses. This might be informally understood in terms of declaring a subset of channels as ‘effectful’ when wiring up our KPN processes.

These ‘effectful’ channels have some nice features:

  • Multiple effectful channels represent concurrent requests.
  • Multiple responses can easily be injected concurrently.
  • An individual effects channel naturally sequences its effects.
  • Requests and responses are buffered and batched up to channel capacity.
  • Internal process model supports parallelism and distribution.

This offers a significant improvement over the singular `(request, continuation)` pair implicit to imperative programming.

Instead of a peephole view of our computation, our effects interpreter may handle many concurrent requests and provide many concurrent responses, and our KPN’s continuation is inherent to the unblocking of processes when our environment drains full request channels or fills empty input channels. Because buffering, parallelism, and concurrency are pervasive, we don’t need workarounds like futures, threads, or shared variables for communication between threads.

By avoiding those workarounds that entangle evaluation with the environment, the resulting system becomes much simpler, easier to serialize and debug.

Note: As an alternative to effects channels, we could augment our process type to support requests on the environment. However, this can trivially be represented by adding two ports to each process (one to write requests, one to receive responses), then wiring an effectful channel between them, then immediately reading a response after each request. Effectful processes would be less expressive for asynchronous behavior, offer less opportunity to intercept and translate requests for a different environment, and cannot buffer more than one pending request per process. Effectful channels are by far the superior design.
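To make the effects-channel arrangement concrete, here is a toy network in Python (threads and bounded queues stand in for a real KPN runtime; all names are mine). A single process writes requests to an open output channel; the interpreter outside the network services them and writes responses back to an open input channel:

```python
import queue
import threading

def run_network(numbers):
    """Toy network: one process, one effectful request/response pair."""
    inp = queue.Queue(maxsize=4)   # wired input port
    req = queue.Queue(maxsize=4)   # open output channel: requests
    rsp = queue.Queue(maxsize=4)   # open input channel: responses
    out = queue.Queue(maxsize=4)   # wired output port

    def doubler():                 # the lone KPN process
        while True:
            n = inp.get()
            if n is None:
                req.put(None)      # propagate shutdown to the interpreter
                return
            req.put(('double', n)) # request on the effectful channel
            out.put(rsp.get())     # responses arrive in request order

    def interpreter():             # the environment, outside the network
        while True:
            r = req.get()
            if r is None:
                return
            op, n = r              # only one request kind in this toy
            rsp.put(2 * n)         # service the request

    threading.Thread(target=doubler).start()
    threading.Thread(target=interpreter).start()
    for n in numbers:
        inp.put(n)
    inp.put(None)
    return [out.get() for _ in numbers]
```

Because a single effects channel sequences its requests, `run_network([1, 2, 3])` deterministically yields `[2, 4, 6]` despite the threads.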

KPNs in context of Purely Functional Programming

In context of purely functional programming, KPNs have more nice properties.

A KPN can easily be described by an immutable value – e.g. a collection of named processes and wires, with a list of messages pending on each wire. Granted, this may prove a fair bit more difficult to represent in statically typed PLs, due to the challenges surrounding heterogeneous collections. I expect clever people can get it done.
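As a sketch, such a value might be rendered in Python with frozen dataclasses (field names are mine); immutability makes serialization and snapshotting trivial:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Wire:
    src: str        # "process.port" at the writing end
    dst: str        # "process.port" at the reading end
    cap: int        # channel capacity (bounded variant)
    pending: tuple  # messages currently queued on the wire

@dataclass(frozen=True)
class KPN:
    procs: tuple    # process names; behaviors looked up elsewhere
    wires: tuple    # the wiring configuration, with pending messages

net = KPN(procs=('parser', 'printer'),
          wires=(Wire('parser.out', 'printer.in', cap=4, pending=(1, 2)),))
```

Since the value is plain immutable data, `eval(repr(net)) == net` holds: the entire network state, including in-flight messages, round-trips through its printed form.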

Evaluation of a KPN is pure and deterministic. This might be represented as a pure function, effectively `evalKPN : KPN → KPN`. Of course, squeezing the most parallelism and performance from the KPN may require special attention from our runtime or compiler.

Intriguingly, evaluation of a KPN can easily proceed in parallel with further actions upon it. This could be expressed by a variant on the evaluator. Something like: `stepKPN : StepKPN k a → KPN → k (a, KPN)`.

During stepKPN, our KPN is running in the background and evaluating, e.g. using conventional threads and queues. The final returned KPN is fully evaluated. But the `StepKPN` type would represent a program that may read and write channels, potentially modify the KPN (add or delete processes and wires, adjust wire capacity, etc..). Importantly, every action on the KPN is deterministic. For example, a read will return immediately if possible (if the channel has data), but if the channel is empty we must wait until input becomes available or until all upstream processes block. Besides actions on the KPN, StepKPN might support operations in some surrounding context `k`.

Effects can be supported either way.

With `evalKPN` we might use a simple tactic to repeatedly:

  • extract pending effect requests
  • inject all available responses
  • initialize evalKPN in parallel
  • process the extracted requests

This would give us frame-oriented effects, with big batches of effects being processed efficiently between frames. This is easy to reason about, and offers adequate parallelism on a single system. Unfortunately, it doesn’t scale very well to distributed systems. However, we could tweak this to model a partitioned KPN system, each partition running its own frames.

With `stepKPN` we can very precisely extract requests, perform effects, and inject responses in a fine-grained manner. This could work very well because we can tune our strategy for reading and writing of open channels based both on the problem and on meta-knowledge of the KPN’s structure.

A third option is viable: associate effects handlers, e.g. of type `request → IO response`, with each effectful channel. Then run the KPN in context of IO. This could maximize parallelism and scalability, albeit at a cost to deterministic ordering of concurrent effects and reproducible system behavior.

That last option would most generically enable KPNs to replace simple imperative IO as the effects model, since it would effectively have the full power of multi-threading but with advantages of inherent batching and reduced complexity.

Posted in Concurrency, Language Design, Modularity, Open Systems Programming, State | 8 Comments