Hierarchical Transaction Machines

Hierarchical transaction machines (HTXMs) are an interesting idea I’ve been developing recently. HTXMs accomplish my goals of live, resilient, declarative, extensible software systems. Further, HTXMs accomplish these goals in a way that’s easy to integrate with existing services and devices, thus avoiding the major weakness of RDP.

First, let’s ignore that ‘hierarchical’ adjective. I’ll get back to it!

A transaction machine (TXM) is essentially: a transaction, repeated indefinitely. We assume our system has a set of TXMs.

TXMs intrinsically support extension and liveness: We can implicitly read each TXM’s behavior at the start of each transaction. This allows us to modify our system behavior atomically, adding new machines and removing others. If necessary, we can simultaneously transition our state resources. Changes always apply immediately. Users and administrators can directly control their systems via definitions or other variables such as configuration flags.

TXMs intrinsically have nice resilience properties: transactions are fail-safe; they can degrade gracefully via fallback behaviors or by halting individual machines, and the system will recover swiftly after a problem is corrected. Further, they can easily support a user interface with consistent views and transactional intervention, allowing for administration or debugging.

TXMs intrinsically are idempotent: having more than one copy of a TXM won’t change a system’s behavior. This allows us to reason simply about whether or not a system has a particular behavior. A list of TXMs is also commutative, but that property is not unusual among concurrency models.

TXMs intrinsically support reactive, concurrent coordination behavior: If a deterministic transaction is unproductive, then repeating it under the same conditions will also be unproductive. Thus, we should wait for observed variables to change. We cannot detect all unproductive behaviors, but we can easily detect those obviously unproductive read-only or aborted transactions. Programmers can leverage this for ad-hoc coordination, e.g. a stream-processing TXM could abort if it reads an empty input queue or full output queue. (Aside: Haskellers might already be familiar with this technique, via the software transactional memory package with use of TMVar, TQueue, or TChan.)
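
This coordination pattern can be sketched outside any particular runtime. A minimal Python illustration (the `Retry` signal and the rollback scheduler are hypothetical stand-ins, not an actual TXM implementation): a transaction that reads an empty input queue aborts, and the scheduler can park it until an observed variable changes.

```python
class Retry(Exception):
    """Signal an obviously unproductive transaction (abort and wait)."""

def stream_copy(txvars):
    """One transaction: move an item from 'inp' to 'out', or abort."""
    if not txvars['inp']:
        raise Retry()            # empty input queue: abort, wait for change
    txvars['out'].append(txvars['inp'].pop(0))

def run_once(txm, txvars):
    """Run one repetition; roll back and report False if unproductive."""
    snapshot = {k: list(v) for k, v in txvars.items()}
    try:
        txm(txvars)
        return True
    except Retry:
        txvars.clear()
        txvars.update(snapshot)  # nothing was committed
        return False

txvars = {'inp': [1, 2], 'out': []}
assert run_once(stream_copy, txvars)       # productive: moves 1
assert run_once(stream_copy, txvars)       # productive: moves 2
assert not run_once(stream_copy, txvars)   # input empty: abort, wait
assert txvars['out'] == [1, 2]
```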

Beyond stream processing, it’s easy for TXMs to model reactive constraint systems (maintaining cells according to some goal), blackboard systems (read blackboard, opportunistically contribute), or publish-subscribe. Multiple models can be easily mixed. We can easily hide transaction variables behind friendly APIs, modeling an object-oriented system.

External systems, such as network or display, can be modeled via asynchronous interaction with system variables. For example, we can write a request to a system task queue. The request would include a variable to receive the response. Then, we can wait for the response in a subsequent transaction. Compared to conventional imperative effects APIs, the main limitation is that we cannot directly model synchronous waits: a transaction that waits for the response before committing the request is doomed to fail! But it’s already common to use asynchronous APIs for external effects, so I think this shouldn’t be a problem.
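
A rough Python sketch of this request/response pattern, with hypothetical names throughout (plain dicts stand in for transaction variables):

```python
def request_tx(task_queue, pending):
    """Transaction 1: commit a request carrying a fresh response variable."""
    resp = {'value': None}            # response variable, empty for now
    task_queue.append(('fetch', resp))
    pending.append(resp)              # remember where to look later

def await_tx(pending, results):
    """A subsequent transaction: abort until the response arrives."""
    resp = pending[0]
    if resp['value'] is None:
        return 'abort'                # response not ready; retry later
    results.append(pending.pop(0)['value'])
    return 'commit'

task_queue, pending, results = [], [], []
request_tx(task_queue, pending)                # commit the request
assert await_tx(pending, results) == 'abort'   # response not yet ready
op, resp = task_queue.pop(0)                   # external system handles it
resp['value'] = 42
assert await_tx(pending, results) == 'commit'  # subsequent transaction
assert results == [42]
```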

These are all wonderful properties, and achieve several of my goals.

However, I have concerns about support for divide-and-conquer tactics and fine-grained task parallelism. Assuming we have a hundred network connections, it is very convenient to declare a transaction machine to handle each one. It is technically trivial to spawn additional machines as a system request. Unfortunately, with anonymous, detached machines, we too easily lose those benefits of liveness, extensibility, and user control.

The idea of ‘attached’ machines leads me to Hierarchical TXMs (HTXMs).

HTXMs form a tree of read-only transactions, with read-write loops at the leaves. Each transaction in an HTXM will be read-fork or read-write, but not both. Forked machines have a clear life-cycle: they survive only until a parent HTXM (or transitive ancestor) must be recomputed. We can model detached machines by attaching to a parent or sibling node via writing to a shared collection. With HTXMs, users have a lot of control over this tree structure.
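
To illustrate the life-cycle rule, here is a toy Python sketch (names and structure are mine, not a real HTXM scheduler): a parent’s read-fork transaction is recomputed only when a variable it read changes, discarding its old leaves and forking new ones.

```python
def fork_node(config):
    """Read-fork transaction: reads config, forks one leaf per connection."""
    return {conn: f'leaf-{conn}' for conn in config['connections']}

def rebuild_if_changed(state, config):
    """Recompute the parent's forks when an observed variable changed."""
    if state.get('seen') != config['connections']:
        state['children'] = fork_node(config)      # old leaves discarded
        state['seen'] = list(config['connections'])

config = {'connections': ['a', 'b']}
state = {}
rebuild_if_changed(state, config)
assert sorted(state['children']) == ['a', 'b']

config['connections'].append('c')    # a config change invalidates the parent
rebuild_if_changed(state, config)
assert sorted(state['children']) == ['a', 'b', 'c']   # leaves recomputed
```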

For performance, then, we’ll want to stabilize our tree structure by stabilizing the variables upon which it depends. Variables read close to the tree root should be the most stable. Fortunately, there are many patterns programmers can use to control stability, e.g. computing and caching relatively stable states in intermediate variables.

I haven’t implemented HTXMs yet. But I believe this model has much to offer, and I’ll continue exploring it for now.


RDP Simplified

Reactive Demand Programming (RDP) is a reactive programming model I developed circa 2010-2012. The essential idea behind RDP is that “effects” should be commutative and continuous.

What I mean by “continuous” is that behaviors must be expressed statefully, attached to stateful representations. Continuity makes it easy to observe and directly manipulate system behavior, which is valuable from a human-in-the-loop perspective. Meanwhile, commutativity means that the order of expression for effects must be irrelevant. This property simplifies reasoning about concurrent behaviors, both within a program and for multi-agent systems.

To achieve commutative and continuous effects, RDP uses a concept of “demands”. A demand is a continuous request from a program (or subprogram), which may be observed and fulfilled by other programs (or subprograms). Demands are only observable collectively, as a set, not individually.

Unfortunately, my original expression of RDP was complicated by three factors:

  • Expression of RDP programs was arrowized, which is relatively painful to work with when compared to monadic programming.
  • I wasted effort to support “instantaneous” reactive loops, i.e. such that we can observe effects without logical delay.
  • I required idempotence, such that duplicate expression of effects do not contribute to system behaviors.

Monadic expression of RDP programs is not difficult, especially if we remove instantaneous loops. A simplified monadic RDP API:

class (Monoid a) ⇒ Bag a    -- requires commutative mappend
class (Monad m) ⇒ RDP m where
    type Var m :: * → *
    type Dir m :: *
    dir :: Dir m → String → m (Dir m)
    var :: (Bag a, Typeable a) ⇒ Dir m → String → m (Var m a)
    get :: Var m a → m a
    put :: Var m a → a → m ()

Unlike conventional monadic effect loops, an RDP monadic behavior is applied continuously, repeating at each time-step. We have a notion of persistent, labeled variables that are available from one time-step to the next. When we get a variable, we’ll receive the value from the previous time-step. When we put, we contribute our value to the next time-step. All contributions within a time-step are unioned together, via commutative mappend.

By relaxing idempotence – that is, by using bags of demands instead of sets – we can add entries together. This is convenient for modeling votes, priorities, and weighted preferences. We can similarly model negation or suppression, which is convenient for modeling rules or constraints (see Behavioral Programming for examples of how this might be useful).
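
For instance, a bag of demands can be sketched with Python’s `Counter`, whose merge is commutative and associative but deliberately not idempotent (a rough analogy, not part of the RDP API):

```python
from collections import Counter

def bag_union(*contributions):
    """Commutative, associative merge of demand bags within a time-step."""
    total = Counter()
    for c in contributions:
        total.update(c)   # counts add together
    return total

votes_a = Counter({'red': 1})
votes_b = Counter({'red': 1, 'blue': 2})
# Order of contributions is irrelevant (commutativity)...
assert bag_union(votes_a, votes_b) == bag_union(votes_b, votes_a)
# ...but duplicates are counted, unlike sets (idempotence relaxed).
assert bag_union(votes_a, votes_b)['red'] == 2
```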

Although we do not allocate ‘new’ variables, a notion of local state is still valuable for RDP systems. For the API above, I borrow the concept of filesystem-like directories. Local state is achieved by assigning a distinct subdirectory to each distinct module or subprogram, such that subprograms cannot accidentally (or maliciously) interfere with each other. We can feasibly support object-capability patterns involving sharing of vars. Of course, a major goal of RDP is to preserve access to stable views of state for monitoring, extension, debugging, etc. Thus, local state is never hidden from the parent process.

Effects in an RDP system are modeled simply in terms of shared variables between a subprogram and its environment – which at the top level should be a runtime or operating system. A subprogram can output demands – e.g. to display a window, to move a robot – into a variable that the environment observes. The environment can provide inputs into a variable that our subprogram observes. (The specific variables involved can easily be hidden behind a record of monadic operations/capabilities.) Of course, providing all possible inputs at all times would be inefficient. But we can leverage flexible patterns that involve registering available services or resources, or use of outputs to demand specific inputs.

In this sense, RDP has some similarities with tuple spaces or blackboard systems, albeit with RDP variables being more fine-grained than blackboards. There is also a similarity with functional relational programming, as described in Out of the Tar Pit by Ben Moseley.

Implementing RDP efficiently remains a challenge. In general, we could try to optimize evaluation of an RDP subprogram by memoizing its partial evaluation and only re-evaluating sub-computations whose observable results have changed. This might require an explicit memoization combinator, with a comparable result type. Extending the monad with explicit fork-join concurrency (e.g. fork : m a → m b → m (a * b) and fork_ : m () → m b → m b) may also help in isolating evaluation of subprograms. However, an efficient implementation of a monadic RDP API should at least be simpler than an efficient implementation of arrowized RDP, by virtue of monads being simpler than arrows.


Monadic RDP

Reactive Demand Programming (RDP) is a programming model I developed circa 2010-2012. I’ve tabled it for a few years to work on other project goals, and since then I’ve also become interested in Kahn Process Networks (KPNs) as a simpler alternative in some ways. But RDP is still an interesting paradigm, with features convenient for live programming and open, distributed systems.

The essential idea for RDP is simply this: concurrent effects should be commutative and idempotent.

That is, expressing a specific request twice is the same as expressing it once or three times. If we have two different requests at the same logical time, the order in which we evaluate or observe them should not matter. With these properties, we preserve a lot of the nice equational reasoning and refactoring features associated with purely functional programming, and a lot of declarative reasoning is feasible. Essentially, effects are sets of declarations for input or output.

There are also some secondary ideas developed for RDP, such as precise logical timing. You can search this blog for more on the subject.

Anyhow, the original RDP models are arrowized, but monadic expression of behaviors would likely be more convenient.

Instead of normal imperative variables, which have one value at a time, our monadic RDP system will only have set variables, each with a label. Writing a variable simply adds a given value to the set. Reading the variable will return the prior value at the label, which will be a set.

The sets observed upon reading a variable must have deterministic ordering and duplicates removed, so we only support comparable value types. Actually, rather than sets we could use any commutative, idempotent monoid type. But sets are the most obvious such type.
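
A tiny Python sketch of the set-variable semantics (illustrative only, not a proposed API):

```python
def commit(writes):
    """Union all writes to one label into the value read next time-step:
    duplicates removed, deterministic ordering for comparable values."""
    return sorted(set(writes))

assert commit([3, 1, 3, 2]) == [1, 2, 3]       # idempotent: dup collapsed
assert commit([1, 2, 3]) == commit([3, 2, 1])  # write order irrelevant
```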

Like RDP arrowized behaviors, our monad value would be applied continuously, corresponding to a stateless behavior applied in just one time-step. Like arrowized RDP, we could also support a “delay” operation that enables RDP behaviors to be distributed over multiple time-steps, but we might need to constrain the number of delay steps (otherwise we end up with untethered, fire-and-forget RDP behaviors – which is interesting, but undermines any live programming features). Effects would be modeled in terms of exposing a subset of variables for external input or output. We might also add a few auxiliary features, such as single-assignment (per time step) labels, value suppression, forking of behavior threads, hierarchical structure for labels, and so on.


Awelon Dictionary

A goal of Awelon project is to make software artifacts easy to share, modify, interact with, and compose. Sharing is easiest for documents, images, and similar artifacts that have no external dependencies. For Awelon, I apply this concept to the codebase.

An Awelon codebase is called a “dictionary”. It’s a big set of definitions that are versioned, curated, and maintained together. A dictionary does not depend on external definitions beyond the few Awelon primitives. This opposes the conventional “library” model where we use ad-hoc dependency and configuration managers to glue together interdependent subsets of definitions.

An Awelon dictionary is not limited to defining functions. It can easily contain definitions for data – maps, music, almanacs, articles, paintings, presentations, fonts, forums, books, or blogs. Binary objects may be embedded indirectly via %secureHash references.  Effectively, a dictionary serves as a “smart” filesystem with potential support for typechecked data, partial evaluations, transparent procedural generation, and spreadsheet-like views. In some cases, soft real-time updates are possible, e.g. adding telemetry data for moving objects.

(Aside: We could follow Haskell’s example of modeling a process with a `main` function and monadic effects detached from dictionary state. However, Awelon project is exploring alternative application models to make computation more accessible and visible. Software agents recording application state in a codebase seems a promising basis.)

To share entire codebases, we may logically embed one dictionary within another, such that dict/foo represents access to foo under the named dictionary. This structure avoids naming conflicts and securely controls dataflows – i.e. data can only flow outwards from `dict/` to its host. However, because dictionaries have no external dependencies, they’ll tend to replicate common function definitions (such as arithmetic and list processing).

Ultimately, Awelon dictionaries are massive, widely shared, largely similar, and frequently updated. In context of use, I desire incremental compilation that is optimally shared with equivalent definitions in similar dictionaries.

It took me a long while to develop a suitable representation.

I eventually found inspiration from log-structured merge-trees, radix trees, and use of secureHash for acyclic reference between binaries. Proposed representation:

/prefix1 secureHash1
/prefix2 secureHash2
:symbol1 definition1
:symbol2 definition2
~symbol3

Each dictionary node is represented as a series of updates. We can either update a symbol (defining or deleting it, `:` or `~` respectively) or an entire prefix `/`. In the latter case, we use a secure hash to identify another dictionary node, and we strip the matched prefix from symbols within that node.

To find the definition for a symbol, only the last update with matching symbol or prefix is examined. Hence, this representation can be used as an append-only log for streaming updates. For example, an update to `/p` will mask prior updates to `:poke` and `/prod`. We assume the new node at `/p` will contain those definitions if we intend to preserve them. The empty prefix (as in `/ secureHash`) is valid and useful. It masks all prior updates and essentially represents a checkpoint or reset in context of a stream, or a prototype inheritance as the first update in a node.
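
A toy Python lookup illustrating these masking rules (the tuple encoding of updates and the `nodes` hash map are my own stand-ins for the format above):

```python
def lookup(nodes, node, symbol):
    """Scan updates from most recent backwards; last match wins."""
    for op, key, payload in reversed(node):
        if op == ':' and key == symbol:
            return payload                    # definition found
        if op == '~' and key == symbol:
            return None                       # symbol deleted
        if op == '/' and symbol.startswith(key):
            # recurse into the referenced node, stripping the matched prefix
            return lookup(nodes, nodes[payload], symbol[len(key):])
    return None

inner = [(':', 'oke', 'def-poke2')]           # node referenced for prefix 'p'
nodes = {'hash1': inner}
root = [(':', 'poke', 'def-poke1'),
        ('/', 'p', 'hash1')]                  # masks the prior :poke update
assert lookup(nodes, root, 'poke') == 'def-poke2'
assert lookup(nodes, root, 'zip') is None
```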

Hierarchical dictionaries may be naively included. Using `/dict/ secureHash` we can logically copy an entire dictionary. The radix tree structure that strips matched prefixes is essential for structure sharing. With `:dict/foo def` we can deeply update an individual definition.

Because recent updates are buffered near the root node, we effectively have lightweight working sets with near O(1) update. Because inner nodes are identified by secure hash, we obtain a lot of structure sharing and reasonably efficient diff computations to help maintain a cache of evaluated or compiled forms.

It seems this very simple dictionary representation can support most of my goals. I can even feasibly support distributed storage and lazy download of secure hash references, so clients can work with dictionaries much larger than a single disk.


Awelon Labeled Data

Awelon is a minimalist language, with four confluent concatenative combinators:

[B][A]a == A[B]     (apply)  
[B][A]b == [[B]A]   (bind)
   [A]c == [A][A]   (copy)
   [A]d ==          (drop)

The motivation for minimalism is simplicity. Awelon project’s goals include that every feature be simple, comprehensible, trustable – from language primitives up to multi-agent software systems (via shared monotonic dictionaries, etc.). So I construct programs from these few primitives. Of course, there are also some disadvantages for performance and syntax. For Awelon, these disadvantages are ameliorated by use of acceleration and projectional editing.

Originally, my intention was to leverage acceleration and projectional editing even for labeled data – records and variants – in Awelon.

The main benefits of labeled data are extensibility, commutative structure for refactoring, and human-meaningful documentation. Labels can be modeled as paths through a trie, where the path encodes the label. Unfortunately, I’ve discovered that the benefits of labeled data seem to be weakened when we choose a concrete encoding of labels and records. It becomes difficult to reason about which observations can be made on a label or record, or how accessors and records might be constructed dynamically. It’s even difficult to preserve meaningful labels when reporting type errors.

I’ve chosen to provisionally add label-primitives to Awelon that work as follows:

[[A] :a [B] :b [C] :c] .b == [B] [[A] :a [C] :c]
[A] :a [B] :b == [B] :b [A] :a  (labels commute)

Logically, these operate linearly on row-polymorphic abstract record constructors:

:a   :   {R without a} [A] :a → {R, a:[A]}
.a   :   [{} → {R, a:[A]}] → [A] [{} → {R}]

The actual record value is abstract and ephemeral, never represented within Awelon code or evaluations. Clients can treat `[[A] :a [B] :b]` as a record. But it’s simultaneously a normal function subject to ad-hoc composition, abstraction, and factoring. Awelon’s abstract record model forbids shadowing of labels and has linear access by default. Records don’t provide any features we couldn’t equally achieve using tuples `[[A][B]]` with extrinsically labeled positions.
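
The access rule can be simulated with an association list in Python (a sketch of the semantics only, not of Awelon’s abstract, ephemeral records):

```python
def access(record, label):
    """ [[A]:a [B]:b [C]:c] .b == [B] [[A]:a [C]:c] """
    value = next(v for (l, v) in record if l == label)
    rest = [(l, v) for (l, v) in record if l != label]
    return value, rest

record = [('a', 'A'), ('b', 'B'), ('c', 'C')]
value, rest = access(record, 'b')
assert value == 'B'
assert rest == [('a', 'A'), ('c', 'C')]
# labels commute: the order of construction is irrelevant to access
assert access([('b', 'B'), ('a', 'A'), ('c', 'C')], 'b')[0] == 'B'
```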

For labeled variants, consider that basic sum type `(A+B)` has a Church encoding `∀r.(A→r)→(B→r)→r`. That is, an observer will supply a “handler” for each case, and the value itself selects and applies one handler, dropping the others. For labeled sums, we simply need a labeled product of handlers – `[[OnA]:a [OnB]:b [OnC]:c …]`. Variant values could be given concrete representation such as `[.label [Value] case]`. We need projectional editing to make this pretty, but the primitive syntax is reasonably compact, self-documenting, and close to the intended meaning.

I’m somewhat reluctant to add this primitive concept to Awelon language. I feel labeled data is sufficiently valuable for extensibility, commutativity, and human-meaningful documentation to warrant some special treatment. But there are also some risks: labels tend to bloat data representations and introduce pressure for naming of things, and many data structures such as lists or trees can get by with just a few labels, so positions work well enough. This feature currently has a “provisional” status in case I find a better approach in the future.


Lightweight Staged Programming for Awelon

Multi-stage programming is essentially disciplined partial evaluation, a widely useful optimization technique, especially in the context of DSLs. By ‘disciplined’ I mean programmers can comprehend and control when computations occur, and statically debug staging errors. This is valuable for avoiding hidden performance regressions.

Yesterday, I was inspired and found a simple way to support multi-stage programming in Awelon by introducing two annotations:

[F](now) == [F]     iff [F] doesn't contain (now)
[F](later)          masks F from analysis by (now)

For Awelon, partial evaluation is simplified by local rewrite semantics and the lack of entanglement with variable environments. Hence annotations do not need to drive partial evaluation. They only offer a means to declare requirements and debug errors – providing some discipline. I can separately develop models (DSLs, monads, etc.) for convenient multi-stage programming.

The (now) annotation is essentially an assertion that peeks deep into otherwise opaque representations. The (later) annotation enables abstraction of staging requirements and working with more than two stages. Global function definitions should be implicitly staged, such that unmasked (now) indicates static evaluation in that context, and we can safely assume named functions are statically stage safe after compilation.

These annotations can be checked dynamically, but it’s feasible to statically compute whether runtime staging will be correct, assuming we know the arities of all the things. Our inferred types would need to record for each input whether any (now) annotation will be exposed due to partial evaluation if just that many inputs are provided.

Aside: Use of (now) and (later) is inspired by temporal logic. Temporal logic is a natural fit for multi-stage programming, but I think most languages wouldn’t support such a lightweight adaptation.


Labeled Data: Records and Variants

Records and variants are basic labeled data types. A record might be expressed as `{x:12.1, y:13.7}`, having a value for each label. A variant might be `Damage:11.3` or `Say:"Hello"`, and involves selection of one label and an associated value. We can generally combine these to have a variant of records or a record of variants. Labeled data is convenient insofar as it is self-documenting (human-meaningful) and extensible. Most modern programming languages support labeled data types as a built-in feature. A few have explored fluent extensibility under the names “row polymorphism” or “polymorphic variants”.

But assume we have a much simpler language. This language readily supports simple binary products (a*b) and sums (a+b). A binary product is a simple pair of values like `{fst:12.1, snd:13.7}` while a binary sum is a simple choice of two like `Left:11.3` or `Right:"Hello"`. Effectively, we have spatial labels that are neither human meaningful nor extensible. For my Awelon project, this isn’t a hypothetical scenario. Awelon can readily Church encode binary products and sums, and can even provide a conventional syntactic projection, but has no built-in type-level support for labeled data.

Given simple binary products and sums, can we recover the benefits of built-in labeled data types?

I think we can do that and better. But first a short detour. Type `0` (aka void) has no values (perhaps excepting `error`). Type 0 is the identity type for sums, in the sense that type `(0+a)` or `(a+0)` is equivalent in many ways to just `a`. Similarly, type `1` (aka unit) is an identity type for products, and `(a*1)` or `(1*a)` is equivalent in many ways to just `a`. But there are some important *structural* differences. Any non-error value from `(0+a)` will be encoded as `Right:_`. With deep sum type structures such as `(0+(0+((0+(a+0))+0)))` we end up encoding values as `Right:Right:Left:Right:Left:_`.

One idea, then, is to encode our label within this Left-Right-Left sequence. One viable encoding is `(RL|RR)*LL` where `RL` means `Right:Left:` and encodes a zero-bit, `RR` encodes a one-bit, and `LL` encodes sequence stop. (This particular encoding has a convenient property of being simple and self synchronizing.) Given our string of ones and zeroes, we may subsequently encode labels using ASCII or UTF-8. This gives us a basis for variants. Similarly, records can be represented by encoding a trie on the label we use for variants. For the earlier example, directly encoding the path would give us a corresponding structure `(1*(1*((1*(a*1))*1)))`. (We might explore alternative encodings to make it easier to extend or enumerate a record.)
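
A small Python sketch of this label encoding (illustrative; the MSB-first bit order within bytes is my assumption, not specified above):

```python
def encode_label(label):
    """Encode a label's bits as a Left/Right path: (RL|RR)*LL,
    where RL encodes a zero-bit, RR a one-bit, and LL stops."""
    path = []
    for byte in label.encode('utf-8'):
        for i in range(7, -1, -1):            # most significant bit first
            bit = (byte >> i) & 1
            path += ['R', 'R'] if bit else ['R', 'L']
    return path + ['L', 'L']                  # sequence stop

path = encode_label('a')                      # 'a' = 0x61 = 01100001
assert path[:4] == ['R', 'L', 'R', 'R']       # bits 0, 1 → RL, RR
assert path[-2:] == ['L', 'L']                # sequence stop
assert len(path) == 8 * 2 + 2                 # two steps per bit, plus stop
```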

Those 0 and 1 types essentially represent the ‘extensibility’ options for our variants and records. We get more choices for variants, more slots for records, while the label encoding itself ensures unbounded extensibility. Conveniently, we can also use a variant label to access a record, which provides a simple basis for ‘case’ or ‘switch’ expressions. We do additionally need syntactic sugar for working with these otherwise onerous structures, and likely a compiler that accelerates them. But there is little need for special type level support. And by modeling these records and variants, we can more readily get various polymorphic features, and the ability to work with labels themselves as first-class, typeable structures.

Anyhow, while I found this idea interesting, and I haven’t seen other people approaching labeled data this way, I wouldn’t be surprised if it’s mentioned in papers older than I am. I’ll be pursuing this further in Awelon project.


Reactive Process Networks

Kahn Process Networks are a very nice sweet spot for functional purity and concurrency, and are widely applicable. I’ve recently considered them as a viable alternative to monadic effects models. However, KPNs are missing an important feature: the ability to merge asynchronous inputs from different channels. There is no consistent sense of time for messages within a KPN.

Fortunately, it is easy to add time to KPNs, and thus have reactive process networks:

  • Every process and message is stamped with an implicit time value.
  • Channels have logical latency, adding that latency to time stamps.
  • READ may return Nothing if no messages are available at the process time.
  • WRITE will stamp the outgoing message with the process’s current time.
  • In addition to READ and WRITE, a process may WAIT on a set of channels.
  • WAIT will advance the process time to the earliest message in the set.
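
A toy Python model of these rules (names and representation are mine, not an implementation proposal): messages carry time stamps, channels add latency, and WAIT advances the process clock to the earliest pending message.

```python
class Channel:
    def __init__(self, latency=0):
        self.latency = latency
        self.msgs = []                        # list of (time, value)

    def write(self, sender_time, value):
        """WRITE: stamp the message with sender time plus channel latency."""
        self.msgs.append((sender_time + self.latency, value))

    def read(self, proc_time):
        """READ: return Nothing (None) if no message at the process time."""
        for i, (t, v) in enumerate(self.msgs):
            if t <= proc_time:
                return self.msgs.pop(i)[1]
        return None

def wait(proc_time, channels):
    """WAIT: advance to the earliest pending message on the channel set."""
    pending = [t for ch in channels for (t, _) in ch.msgs]
    return max(proc_time, min(pending)) if pending else proc_time

a, b = Channel(latency=2), Channel(latency=5)
a.write(0, 'x')                  # arrives at logical time 2
b.write(0, 'y')                  # arrives at logical time 5
t = 0
assert a.read(t) is None         # nothing available yet at time 0
t = wait(t, [a, b])              # advance to the earliest message
assert t == 2
assert a.read(t) == 'x'
assert b.read(t) is None         # b's message not until time 5
```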

For efficient distributed implementation, we’d implicitly use synchronization messages to report the advance of logical time for a waiting process. Intriguingly, it is feasible to perform some optimistic concurrency: determine likely future outputs on an assumption that no last-second messages on open input ports will change things. And of course, we’ve also solved the expressive weakness of KPNs. We can merge asynchronous messages into a stream, or represent clock-like behaviors – loops with latency.

Another interesting point is that basic KPNs don’t support bounded-buffer ‘push back’ because it can result in deadlock. We can implement them that way and accept the risk, but it isn’t part of Kahn’s original model. But with reactive KPNs, we have less need for bounded buffers because we can bound computations in ‘time’, e.g. by evaluating a KPN up to a given time or controlling the advance of time for open input channels.

The time stamps are logical, and are never directly observed by a process. Time stamps and latencies may use simple natural numbers. We can ‘normalize’ time stamps within a network, e.g. by subtracting the smallest time stamp in the network from all time stamps in the network. For integration with real-time systems, our logical time stamps may have a simple mapping to wall-clock times, such as logical microseconds. We can also use logical latencies to guide physical distribution when evaluating large process networks.

The original KPN behavior is supported. We can always WAIT before READ. Or we can simply use zero-latency wiring and never advance time on open input channels. While READ does not logically WAIT, it must still implicitly wait until we determine whether or not more messages will arrive at a process’s current time.

Reactive process networks resolve my few remaining concerns with using KPNs as both a primary effects model and as a foundation for first-class high-performance distributed programming within purely functional computations. Reactive KPNs could also serve as an excellent substrate for my reactive demand programming model.


Lambdas in Tacit Code

While tacit programming is an excellent default, it is not suitable for all use cases. Attempting to use tacit programming everywhere results in pointlessly onerous data shuffling. A few local names, in the form of lambdas and let expressions and lexical closures, can simplify a useful subset of code, making it easier to read and write. Many tacit languages, including Factor and Kitten, provide an ‘escape’ from the tacit style. Factor uses a macro rewrite, while Kitten treats this as a primitive feature with support from the syntax and runtime.

I am developing a tacit concatenative language, Awelon. Unlike conventional programming languages, Awelon takes an approach of storing code with a simple, uniform syntax in a database, and minimizing the core language. Awelon has become considerably simpler since the last time I discussed it on this blog. If you’re interested, I encourage you to peruse the linked documentation. The core of Awelon is just four primitive combinators:

Awelon Primitive Combinators
[B][A]a == A[B]           (apply)
[B][A]b == [[B]A]         (bind)
   [A]c == [A][A]         (copy)
   [A]d ==                (drop) 

Awelon additionally has syntactic sugar for natural numbers and text literals, and parenthetical annotations like `(nat)`, `(memo)`, or `(par)` for performance and debugging. But all observable behavior and data is ultimately defined in terms of the four combinators. All data is logically Church or Scott encoded as functions, even if accelerated under the hood. Because Awelon’s syntax and semantics are very simple at the storage layer, I leverage editable views for presenting a more sophisticated view to the programmer. For example, a proposed view might expand decimal number `3.141` to `[[3141 0 integer] 3 decimal]`. And `-4/6` to `[[0 4 integer] 6 rational]`.
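
A possible reading of that decimal projection, sketched in Python (the `(mantissa, negated, decimals)` decomposition is my interpretation of the examples, not a specified representation):

```python
def to_view(mantissa, negated, decimals):
    """Render the source decomposition as a decimal string, e.g. '3.141'."""
    sign = '-' if negated else ''
    s = str(mantissa).rjust(decimals + 1, '0')
    return sign + (s[:-decimals] + '.' + s[-decimals:] if decimals else s)

def to_source(text):
    """Parse a decimal string back into the source decomposition."""
    neg = text.startswith('-')
    whole, _, frac = text.lstrip('-').partition('.')
    return int(whole + frac), neg, len(frac)

assert to_view(3141, False, 3) == '3.141'
assert to_source('3.141') == (3141, False, 3)
# the projection is reversible: view and source round-trip
assert to_source(to_view(*to_source('3.141'))) == (3141, False, 3)
```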

For Awelon, an escape from the tacit style should also be an editable view – a bit like Factor’s macro expansion, but reversible because I’ll be storing the tacit representation. How to accomplish this, exactly, didn’t really click for me until yesterday when I was reviewing the Wikipedia article on combinatory logic and specifically the translation from lambda calculus to SKI combinators. I can apply a similar translation to extract lambda variables from Awelon code:

Extract to Argument
   T(X,E) extracts X from E
   ∀X,E. [X] T(X,E) == E
   T(X,E) does not contain X

T(X, E) | E does not contain X => d E
T(X, X) => i  (where [A]i == A)
T(X, [X]) =>
T(X, [E]) => [T(X,E)] b
T(X, F G)
  | only F contains X => T(X,F) G
  | only G contains X => [F] a T(X,G)
  | otherwise  => c [T(X,F)] a T(X,G)

We can implement i = [][]b a a d
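
The translation rules above transcribe almost directly into code. A rough Python sketch, encoding Awelon programs as nested token lists (my own encoding: a variable or combinator is a string, a block [E] is a Python list):

```python
def contains(expr, x):
    """Does program E mention variable X (possibly inside a block)?"""
    return any(tok == x or (isinstance(tok, list) and contains(tok, x))
               for tok in expr)

def extract(x, expr):
    """T(X, E): produce code, free of X, such that [X] T(X,E) == E."""
    if not contains(expr, x):
        return ['d'] + expr                 # T(X,E) => d E
    if expr == [x]:
        return ['i']                        # T(X,X) => i
    if expr == [[x]]:
        return []                           # T(X,[X]) => (empty)
    if len(expr) == 1 and isinstance(expr[0], list):
        return [extract(x, expr[0]), 'b']   # T(X,[E]) => [T(X,E)] b
    f, g = expr[:1], expr[1:]               # split E into F G
    if not contains(g, x):                  # only F contains X
        return extract(x, f) + g
    if not contains(f, x):                  # only G contains X
        return [f, 'a'] + extract(x, g)
    # otherwise: copy X, extract from both halves
    return ['c', extract(x, f), 'a'] + extract(x, g)

assert extract('X', ['X']) == ['i']
assert extract('X', [['X']]) == []
assert extract('X', ['foo']) == ['d', 'foo']
# the X X * X - example from below, via the general (non-simplified) rules
assert extract('X', ['X', 'X', '*', 'X', '-']) == \
    ['c', ['i'], 'a', 'c', ['i'], 'a', ['*'], 'a', 'i', '-']
```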

This is a simple algorithm that generates efficient code. The translation is ‘reversible’ by partial evaluation of `[X] T(X,E)` to propagate the variable back into the rendered view. The rules can be simplified a bit (no need for `i`) if we assume X is a value and instead extract to `X T(X,E)`.

I find Factor’s syntax for named locals to be headache-inducing. But Kitten’s `→ X;` assignment syntax is reasonably aesthetic and effectively supports both lambdas and let expressions. Consider an adaptation of Kitten’s syntax for named locals to Awelon:

View: → X; X
Source: (λ X) i

View: → X Y Z; X foo [Y] bar
Source: (λ X Y Z) d [i foo] a bar

View: 7 → X; X X * X -
Source: 7 (λ X) c a c a [*] a i -

Here the `(λ X)` is an annotation or comment that tells our editable view that we intend to present the code as a lambda expression with a specific variable name ‘X’. The details of its representation aren’t essential. What does matter is that we can use the comment to systematically recover the view from source:

Source: (λ X) i
View: → X; [X] i  
  => → X; X

Source: (λ X Y Z) d [i foo] a bar
View: → X Y Z; [X][Y][Z] d [i foo] a bar
  => → X Y Z; [X][Y][i foo] a bar      (dropped Z)
  => → X Y Z; [X]i foo [Y] bar         (apply a)
  => → X Y Z; X foo [Y] bar            (apply i)

Source: 7 (λ X) c a c a [*] a i -
View: 7 → X; [X] c a c a [*] a i -
  => 7 → X; [X] [X] a c a [*] a i -
  => 7 → X; X [X] c a [*] a i -
  => 7 → X; X X [X] [*] a i -
  => 7 → X; X X * [X] i -
  => 7 → X; X X * X -

It seems to work, though we may need to limit partial evaluations based on the patterns in ‘extract to argument’ to avoid evaluating parts of code irrelevant to the lambda.

Awelon programmers will have named locals where convenient without entangling names with syntax or semantics. This has me feeling more confident in my decision to favor concatenative combinators as a simpler base for computation than lambda calculus.

Addendum: To avoid unnecessary copying, we can introduce specialized rewrite rules for conditional behaviors of general form `T(X, [onT][onF]if) => [T(X, onT)][T(X, onF)]if`. I assume `if` will drop one branch and inline the other based on a boolean on the stack. Thus, I leave `X` on the stack rather than copy it into both branches.

Posted in Language Design, User Interface

KPNs as an Effects Model

In this article, I motivate a variant of Kahn Process Networks (KPNs) with bounded channels as a better – more efficient, scalable, composable, securable, serializable, comprehensible, and debuggable – approach to modeling effectful application behavior, especially in the context of purely functional languages and conventions today.

Background

In the context of purely functional programming, Haskell pioneered a simple and effective approach to effects by simulating imperative programming: model a data structure that either returns a final value or yields with a `(request, continuation)` pair. By making this data structure opaque and inaccessible to user code, a compiler aware of the structure can optimize and inline requests heavily to achieve acceptable performance. In Haskell, this opaque data structure is called `IO`. (This trivial pattern also goes by obscure and scary names from category theory, but that hasn’t done Haskell any favors.)
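The shape of this structure can be sketched with Python generators, which make the `(request, continuation)` pair explicit: each `yield` surrenders a request, and the suspended generator is the continuation. The request names here are illustrative, not from any real API.

```python
def program():
    """An 'imperative' program: each yield is one request, then a wait."""
    a = yield ('get', 'x')        # request; resumed with the response
    b = yield ('get', 'y')
    return a + b

def run(prog, handler):
    """The interpreter: peel off one (request, continuation) pair at a time."""
    gen = prog()
    resp = None                   # first send must be None to start the generator
    try:
        while True:
            request = gen.send(resp)
            resp = handler(request)   # strictly sequential: one request in flight
    except StopIteration as done:
        return done.value
```

For example, `run(program, lambda req: {'x': 1, 'y': 2}[req[1]])` returns `3`, with the handler invoked exactly once per request, in order.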

One fundamental weakness of this design, however, is the singular nature of that `(request, continuation)` pair. Implicitly, this becomes a performance bottleneck: we’re performing only one ‘request’ at a time. Fortunately, we can work around this limitation in a few ways:

  • Asynchronous requests. Rather than awaiting a result immediately, we may continue with a ‘future’ reference for the result that can be used in a future request. The request itself may then be processed in parallel with ongoing computation.
  • Multi-threading. We can support a request to fork a thread that itself can make more requests. Further, we can support requests for communication between threads, e.g. via channels or shared mutable variables.

And this is essentially what Haskell does. We have our `forkIO` request, and a panoply of communication methods (IORefs, MVars, TVars, channels modeled with them, etc.). Asynchronous futures are frequently implicit in Haskell’s lazy evaluation model.

Unfortunately, these workarounds come at significant cost. Those futures and variables deeply entangle the application with its environment, which hinders serialization, persistence, and debugging. Multi-threading introduces non-deterministic behavior, which can hinder robust testing or reproduction of error conditions.

A second major weakness of this design regards the opacity of that `(request, continuation)` pair.

Assume two applications with opaque effects – `foo.main()` and `bar.main()`. We could compose these sequentially, or in parallel, or invoke one from within the other. However, we cannot control the interleaving of requests. We cannot intercept requests in a mockup environment for testing, security, or precise embedding. It is not possible to construct wrappers around an application to tune requests and responses for a different environment. It is not possible to cleanly halt an application. Ultimately, this hurts composition, testing, security, portability, and process control.

Again, there are workarounds – e.g. sandboxes, ad-hoc dependency injection, object capability security. But again, there is a cost – this time to those performance benefits we originally received from compiling with opaque effects.

Kahn Process Networks

KPNs are a simple model consisting of processes and channels. A process is limited to two actions: read from a channel, or write to a channel. A channel carries a FIFO queue of messages, and may have at most one reader and at most one writer. Reading from a channel will block if there is no message available.

Generally, writing to a KPN channel does not block. But I believe a variant of KPNs where each channel is annotated with a maximum capacity is superior for every practical use case. Bounded channels can be trivially implemented above normal KPNs by adding a reverse channel for acknowledgements to every message channel, then simply waiting for an ack after sending.
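The ack-channel construction might look as follows, with `queue.Queue` standing in for an unbounded KPN channel. This sketch takes the ack before sending rather than after, which is the same trick with the acks pre-loaded: one ack circulates per free slot, and a slot is freed each time the reader consumes a message.

```python
import queue

class BoundedChannel:
    """A bounded channel built from two unbounded FIFOs: a data channel
    plus a reverse channel carrying one acknowledgement per free slot."""
    def __init__(self, capacity):
        self.data = queue.Queue()        # the ordinary (unbounded) KPN channel
        self.acks = queue.Queue()        # reverse channel for acknowledgements
        for _ in range(capacity):        # pre-load one ack per available slot
            self.acks.put(())

    def write(self, msg):
        self.acks.get()                  # claim a slot; blocks while the channel is full
        self.data.put(msg)

    def read(self):
        msg = self.data.get()            # blocks while the channel is empty
        self.acks.put(())                # acknowledge: the slot is free again
        return msg
```

After `capacity` unacknowledged writes, the next `write` blocks until a `read` returns an ack, giving exactly the bounded-channel semantics above normal KPN channels.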

Construction and composition of KPNs is readily achieved in the style of Flow-Based Programming (FBP). Instead of directly naming a channel, a process will read or write a port. Separately, a configuration wires output ports from one process to input ports on another.

Aside: KPNs and FBP are similar. Some important differences: FBP has bounded channels by default, ad-hoc side-effects, and non-deterministic ‘merge’ for multiple writers into a shared wire.

KPNs can be used as an effects model.

Open (aka dangling) output channels can be filled with requests to be read by the KPN’s interpreter. Responses can be written to open input channels. A lightweight, declarative configuration can tell our interpreter from which channels it should read requests, and in each case to which channel it should write responses. This might be informally understood in terms of declaring a subset of channels as ‘effectful’ when wiring up our KPN processes.
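A minimal, deterministic evaluator along these lines can be sketched with processes as generators and channels as queues. The channel names, the round-robin scheduler, and the uniform capacity are my own choices for illustration: `'in'` is an open input channel (the environment injects responses), `'out'` is an open output channel (pending requests for the interpreter), and `'mid'` is ordinary internal wiring.

```python
from collections import deque

def run_kpn(procs, channels, capacity=4):
    """Run processes until all are blocked or finished. A process is a
    generator yielding ('read', ch) or ('write', ch, msg); reads block on
    empty channels, writes block at capacity. KPN evaluation is confluent,
    so the scheduling order cannot change the final channel contents."""
    inbox = {name: None for name in procs}    # value to resume each generator with
    blocked = {name: None for name in procs}  # operation a process is stuck on
    live = set(procs)
    progress = True
    while progress:
        progress = False
        for name in sorted(live):             # simple round-robin schedule
            gen = procs[name]
            while name in live:
                op, blocked[name] = blocked[name], None
                if op is None:
                    try:
                        op = gen.send(inbox[name])
                        inbox[name] = None
                    except StopIteration:
                        live.discard(name)
                        break
                if op[0] == 'read':
                    if not channels[op[1]]:
                        blocked[name] = op    # wait for input
                        break
                    inbox[name] = channels[op[1]].popleft()
                else:                          # ('write', ch, msg)
                    if len(channels[op[1]]) >= capacity:
                        blocked[name] = op    # wait for free space
                        break
                    channels[op[1]].append(op[2])
                progress = True
    return channels

def doubler():
    while True:
        x = yield ('read', 'in')
        yield ('write', 'mid', 2 * x)

def inc():
    while True:
        x = yield ('read', 'mid')
        yield ('write', 'out', x + 1)

channels = {'in': deque([1, 2, 3]), 'mid': deque(), 'out': deque()}
run_kpn({'double': doubler(), 'inc': inc()}, channels)
```

When evaluation quiesces, `'out'` holds the batch of pending requests and both processes sit blocked on reads, ready to resume as soon as the environment refills `'in'` or drains `'out'`.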

These ‘effectful’ channels have some nice features:

  • Multiple effectful channels represent concurrent requests.
  • Multiple responses can easily be injected concurrently.
  • An individual effects channel naturally sequences its effects.
  • Requests and responses are buffered and batched up to channel capacity.
  • Internal process model supports parallelism and distribution.

This offers a significant improvement over the singular `(request, continuation)` pair implicit to imperative programming.

Instead of a peephole view of our computation, our effects interpreter may handle many concurrent requests and provide many concurrent responses, and our KPN’s continuation is inherent to the unblocking of processes when our environment drains full request channels or fills empty input channels. Because buffering, parallelism, and concurrency are pervasive, we don’t need workarounds like futures, threads, or shared variables for communication between threads.

By avoiding those workarounds that entangle evaluation with the environment, the resulting system becomes much simpler and easier to serialize and debug.

Note: As an alternative to effectful channels, we could augment our process type to support requests on the environment. However, this can trivially be represented by adding two ports to each process (one to write requests, one to receive responses), wiring an effectful channel between them, and immediately reading a response after each request. Effectful processes would be less expressive for asynchronous behavior, offer less opportunity to intercept and translate requests for a different environment, and cannot buffer more than one pending request per process. Effectful channels are by far the superior design.

KPNs in context of Purely Functional Programming

In the context of purely functional programming, KPNs have further nice properties.

A KPN can easily be described by an immutable value – e.g. a collection of named processes and wires, with a list of messages pending on each wire. Granted, this may prove a fair bit more difficult to represent in statically typed PLs, due to the challenges surrounding heterogeneous collections. I expect clever people can get it done.

Evaluation of a KPN is pure and deterministic. This might be represented as a pure function, effectively `evalKPN : KPN → KPN`. Of course, squeezing the most parallelism and performance from the KPN may require special attention from our runtime or compiler.

Intriguingly, evaluation of a KPN can easily proceed in parallel with further actions upon it. This could be expressed by a variant on the evaluator. Something like: `stepKPN : StepKPN k a → KPN → k (a, KPN)`.

During stepKPN, our KPN is running in the background and evaluating, e.g. using conventional threads and queues. The final returned KPN is fully evaluated. But the `StepKPN` type would represent a program that may read and write channels, and potentially modify the KPN (add or delete processes and wires, adjust wire capacity, etc.). Importantly, every action on the KPN is deterministic. For example, a read will return immediately if possible (if the channel has data), but if the channel is empty we must wait until input becomes available or until all upstream processes block. Besides actions on the KPN, StepKPN might support operations in some surrounding context `k`.

Effects can be supported either way.

With `evalKPN` we might use a simple tactic to repeatedly:

  • extract pending effect requests
  • inject all available responses
  • initialize evalKPN in parallel
  • process the extracted requests

This would give us frame-oriented effects, with big batches of effects being processed efficiently between frames. This is easy to reason about, and offers adequate parallelism on a single system. Unfortunately, it doesn’t scale very well to distributed systems. However, we could tweak this to model a partitioned KPN system, each partition running its own frames.
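The frame tactic above can be sketched abstractly. Here `step` stands in for the pure `evalKPN`, and the KPN value is reduced to just its open request and response queues; all names are illustrative. In a real implementation `step(state)` would run in parallel with processing the extracted requests, but they are sequential here for clarity.

```python
def frame_loop(state, step, handle, n_frames):
    """Frame-oriented effects: each frame extracts the batch of pending
    requests, injects the previous frame's responses, evaluates the KPN
    purely, and then processes the extracted requests."""
    responses = []
    for _ in range(n_frames):
        requests, state = state['req'], {**state, 'req': []}   # extract pending requests
        state = {**state, 'resp': state['resp'] + responses}   # inject available responses
        state = step(state)                                    # pure evalKPN (parallelizable)
        responses = [handle(r) for r in requests]              # process the extracted batch
    return state

# Toy instantiation: the 'KPN' simply echoes each response back out as a
# new request, and the handler increments, so a request value grows by one
# every two frames.
echo_kpn = lambda s: {'req': s['resp'], 'resp': []}
final = frame_loop({'req': [0], 'resp': []}, echo_kpn, lambda x: x + 1, 4)
```

Big batches of requests are handled between frames, which is what makes this variant easy to reason about and efficient on a single system.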

With `stepKPN` we can very precisely extract requests, perform effects, and inject responses in a fine-grained manner. This could work very well because we can tune our strategy for reading and writing of open channels based both on the problem and on meta-knowledge of the KPN’s structure.

A third option is viable: associate effects handlers, e.g. of type `request → IO response`, with each effectful channel. Then run the KPN in context of IO. This could maximize parallelism and scalability, albeit at a cost to deterministic ordering of concurrent effects and reproducible system behavior.

That last option would most generically enable KPNs to replace simple imperative IO as the effects model, since it would effectively have the full power of multi-threading but with advantages of inherent batching and reduced complexity.

Posted in Concurrency, Language Design, Modularity, Open Systems Programming, State