Declarative State

State models favored for imperative programming, such as mutable references, are unsuitable for declarative programming models including Reactive Demand Programming (RDP). Declarative state is not difficult or complicated, but it is different and might require its own eureka moment.

The general scenario is that we are manipulating state by use of declarations.

As discussed in a previous article, declarations are statements that may hold over a period of time, and may overlap in time. (Contrast events, which hold only for an instant.) Declarations that hold at any given instant must be commutative and idempotent. (Contrast events, generally processed in some order and may be counted.)

To help elucidate some of the constraints on declarative state, I break the general scenario into four quadrants on two axes – exclusive vs. concurrent writers, and discrete vs. continuous time.

Exclusive Writer, Discrete Time

By exclusive writer, I mean there is at most one declaration for how to write state. And discrete time means that there is a well defined `next` time that is distinct from the `current` time.

This is the easy case.

There is no risk of write conflicts. There is no risk of semantic confusion with regards to duplicate writes. The functions for manipulating state are simple – we just declare the next state to be a function of the current state and other observations the writer might have made. Example declarations might be `next = curr + 2` or `next = curr * 3`.

There will also be some default `idle` behavior for state – most likely `next = curr` or `next = 0`; the idle behavior will apply if there is no other declaration. (The auto-zero behavior would have some nice properties with regards to garbage collection.)

Concurrent Writers, Discrete Time

Concurrent writers introduce two challenges relative to the exclusive writers model: write declarations conflict, and instability from idempotence.

Write Declarations Conflict

Taking the earlier example, we might see BOTH declarations – `next = curr + 2` and `next = curr * 3` – at the same time. These proposed operations clearly do not commute – i.e. (curr+2)*3 is always 4 points different from (curr*3)+2. So we can’t just apply them in an arbitrary order.

Fortunately, we don’t actually need the operations to commute. For declarative definition, we need the declarations to commute.

Essentially, at any given instant we have a set of declarations from all attempting to influence the shared state resource. This set is gathered from all writers, local and distributed. This set will ultimately be represented in some ordered manner in physical memory, but the next state must not depend on that ordering nor on any properties that might have influenced it (such as race conditions, origin, position in source code).

So long as we obey that restriction, we have a lot of freedom on how to resolve conflicts. A few valid options include:

  1. Report a `write conflict` and perform the idle behavior. Push the burden of avoiding concurrent operations back on the developers.
  2. Choose one operation to win based on some arbitrary condition that is NOT related to ordering in the set. For example, execute both then choose whatever option results in the smaller value.
  3. Apply a true-random selection. This, perhaps counter to intuition, is valid because the cause of the ordering is independent of the declarations. Though, I’d disfavor this option because it unnecessarily introduces indeterminism.

So the write conflicts aren’t avoided. Rather, they’re resolved in some arbitrary manner.

While such resolution is sufficient to respect the declarative definition, the arbitrariness does seem… unsatisfactory, inelegant.

But declarative definition has nothing to say on the elegance of state models. The job of finding `collaborative` state models that effectively support concurrent writers is ultimately up to language and framework designers.

Instability from Idempotence

Consider two independent declarations: `next = curr + 2` and `next = curr + 5`. Additions do commute, so if we applied both the result would be curr+7. So it seems okay to apply both demands in an arbitrary order. This would be even more `obvious` if operations on state were restricted to additions, so we’re just looking at declarations of the form `+2` and `+5` (once we’ve isolated it to a particular state resource).

Since the writers are independent, they do not know exactly what the other writers are doing. The writer of the `+2` declaration might observe that state is actually increasing 7 points each round, but does not know why. There are many possible scenarios resulting in that observable outcome: {+2,+5} or {+2,+1,+4}, or {+2,+3,+4,-2}. Or even {+2,+2,+5} due to idempotence (which eliminates duplicates).

Now for a change: a writer of a +2 declaration changes it to a +3 declaration. What happens? Well, depending on the initial scenario, the increment per round might change by -2, 0, +1, or +3.

Due to circumstances outside local control, a small change in the declaration of one writer might cause a wide variety of changes in state. For developers, day to day, this is not fun. It is chaotic, unstable, frustrating.

So, how do we achieve stability?

Well, the problem was caused because the declarations were directly additive. +1 and +3 added to +4. +2 and +2 added to +2, due to idempotence. What we need is a state model that becomes more stable as the concurrent writers are closer to agreement – additive won’t work.

We could try averaging the values. But consider a scenario such as {+1,+1,+1,+5}. We want this to `average` to +2, but it would actually average to +3 since the +1 only counts once (due to idempotence). So, while this is better, it is insufficient to avoid the problem.

It turns out that we simply cannot treat writers the same as declarations.

When we need to count writers – e.g. to model a vote, or a set of independent forces – then we must make it more explicit by providing an identifier for the writer as part of each declaration. For example, instead of {+2,+2,+5} the scenario becomes {(w1,+2),(w2,+2),(w3,+5)} where w1, w2, and w3 are values (perhaps URLs or GUIDs) identifying distinct writers.

Otherwise, we can use simple mechanisms that do not depend on the number of writers. For example, I might favor simply selecting the smallest value, so {+2,+2,+5} reduces to +2.

Again, this sort of arbitrary choice seems inelegant and unsatisfactory. But if we want more elegant options, we’ll need state models where the declarations contributing to the future state are actually idempotent, so that we may combine them easily.

Exclusive Writer, Continuous Time

The discrete time assumption simplified many things – state is constant in each `instant` and there is a clear `next` instant that serves as the trigger for updates. Continuous time introduces new possibilities, such as continuous varying state (useful for physics simulations). But it also introduces many challenges.

Discrete State with Continuous Time

While discrete-varying state models can be used with continuous time, the lack of a `next` instant (or, rather, the infinite abundance of `next` instants) hinders certain expression.

Consider a declaration such as `next = curr + 2`, held for even one millisecond. What would that mean? Do we increment once? Do we increment a billion times because there are a billion picoseconds in a millisecond? Do we increment infinity times? None of those options makes sense.

Without use of `next`, we can still use discrete state in continuous time. However, the writer must describe a fixpoint computation. Conceptually, the declaration is applied an infinite number of times before even one picosecond has passed, and so we must have fixpoint convergence. (Note that idempotent updates qualify trivially as fixpoints.)

We can model many things with fixpoint state models. But it is difficult to prove that the fixpoint of arbitrary functions will converge. And the lack of expressiveness is frustrating. And design patterns to work around problems (i.e. using delayed writes to incrementally modify state) tend to be onerous and unstable.

Fortunately, there is an idiom to regain full expressiveness. Instead of declaring `next = curr + 2`, I might declare `next = curr + 2 with cooldown of 2ms`. The cooldown would prevent the operation from applying an infinite number of times, and essentially specifies its own concept of discrete time.

The `cooldown` concept leaves a lot to be desired. What if the write declaration changes and the state is still in cooldown? It seems we might easily miss important writes. But there are other concepts similar to cooldown that achieve the same purpose. I’ve explored a time debt approach and timeouts.

Continuous State

Continuous-varying state can potentially have a different value at every instant in time. This can be useful for physics simulations; for example, we could precisely model the path of a bullet. But there are a number of new challenges here.

I have only idly explored this space, as it is a mere curio to my current visions.

But I have determined that there aren’t many simplistic continuous models that actually seem useful. For example, I had considered representing state in terms of an integral of forces, but every practical use-case I’ve developed calls instead for a system of differential equations (to account for friction, air resistance, etc.).

One model I have found reasonably useful is a continuous follow-the-leader or dangling-carrot behavior, suitable for simple continuous animations (e.g. for a user interface, or robotic motion). I’ve been wondering if such a model might be generalized in terms of Bezier control points.

Concurrent Writers in Continuous Time

Concurrent writers again introduce the challenge of combining writes, resolving write conflicts, and ensuring stability.

For discrete-varying state, combining writes makes it even more difficult to validate whether a fixpoint function will converge. Also, while the time-debt idiom seems to work fairly generically to support multiple writers, it took me a while to discover that pattern.

For continuous-varying state, we can expect continuous-varying declarations to control it (essential, for composition). The problem of combining write declarations and making useful resolutions means we’ll start to encounter challenges of detecting zero-crossings – i.e. find the time that signal A becomes less than signal B. Unless our continuous signals are of an analytic nature, detection of zero crossings will generally be imprecise and expensive – i.e. implemented by sampling near suspected collision points.

RDP and Declarative State

RDP falls into that last quadrant. The time model is continuous. Since I cannot create state in RDP, I assume that other writers can also `discover` it, and I also lack freedom to provide an ad-hoc combining function. I must instead `discover` state with an appropriate combiner function already built-in.

The inability for normal developers to introduce ad-hoc combiner functions means I must invent a couple `standard` state models for RDP – that RDP languages can be assumed to support in the same sense that imperative languages are assumed to support mutable references.

I’ve written about a few of my efforts and ideas – reactive state transition, animated reactive state transition, and animated reactive term rewriting, among others.

One decent declarative state model that is so simple it is almost beneath mention is simply mutating a set. This is a common state model for temporal logic approaches (such as Berkeley’s Dedalus and BLOOM), and in practice it is used very much like a tuple space or an SQL table. It works as follows:

  • State is a set of values, initially empty
  • A declaration can add a value
  • A declaration can delete a value
  • In case of conflict, delete wins
  • Update is modeled as add + delete
  • Idle behavior is to maintain the current set.

Since add and delete operations are idempotent, and commutative with add and delete on other values, there are very few conflicts. The ONLY conflict is adding and deleting the same value. State of an application can be divided across many such spaces.

Unfortunately, this state model isn’t very effective for a broad class of problems – in particular, incremental updates of values, such as a editing a textbox. And there are no obvious and simple tweaks to fix that problem without causing other problems.

More practical for a broad spectrum of applications is the animated reactive term rewrite state model. It is a most promising `de-facto` option for RDP at the moment – unless I figure out how to make RST blazing fast and achieve my vision of model reactive databases with reactive indexes and such.

Conclusions

This article is not about the utility of declarative state. I speak only about what it means in practice, and how it compares to imperative state. I would like readers to take two things away from this article:

  1. how declarative state can be achieved, and the relevant constraints (inflexibility) compared to imperative state models
  2. that resolving concurrent write conflicts is rather arbitrary regardless of the imperative vs. declarative distinction, it just expresses in different ways (e.g. last update wins vs. arbitrary resolution rule)
Posted in Language Design, Reactive Demand Programming, State | 3 Comments

Defining `Declarative`

I’ve seen many definitions for `declarative programming`. The term `declarative` is often contested because it has connotations and market buzz surrounding it – i.e. there is some advantage to saying your product is `declarative` even if that means stretching the term. I freely admit that I might be guilty of this, too.

My understanding of `declarative` is that it describes a synergistic intersection of several useful properties. Consequently, I can place languages on a lattice for just how declarative they happen to be. This article explains how I reach the definition I favor.

I start with the informal notion of `declaration`:

declaration: an explicit (written or oral) statement or indication of fact, opinion, or belief.

I believe a declarative programming language should enable expressing a program in terms of declarations – not just one big declaration, but a bunch of smaller statements each indicating a fact, opinion, or belief about the program’s behavior. For programming, of course, I’m primarily interested in written declaration.

How am I to recognize that an expression is a declaration? Well, I compare to some ideal properties of declarations.

  1. Explicitly stating the same fact, opinion, or belief twice in a given context doesn’t affect its meaning or truth.
  2. Source, location, and origin of a fact, opinion, or belief doesn’t affect its meaning or truth. Though, it is sometimes useful to track origin as part of the fact (e.g. `David believes that declarative is …`).
  3. Facts, opinions, and beliefs have a `continuous` nature – they hold over time. We can express statements about past events (e.g. `the ball went through the hoop`) but even such statements are true `over time`.
  4. Facts, opinions, and beliefs will name, describe, and relate objects in an implicit environment or system (e.g. `the ball`, `the hoop`). Thus, the meaning of a declaration is actually a function of its environment.

Formalizing these concepts gives me the properties I now associate with `declarative` expression:

  • idempotent: declarative expressions can be duplicated, and duplicate expressions can be eliminated, without affecting behavior of the program.
  • commutative: declarative expressions can be reordered or reorganized without changing their meaning.
  • concurrent: declarative expressions naturally hold at overlapping times. Declarative systems are implicitly concurrent in the sense of separate expressions holding simultaneously.
  • reactive: declarative expressions hold over time and have meaning relative to their implicit environment. If the environment changes over time, so will the active meaning of the declarative expression. [upd. 1/13]

Some people say that declarative is about eliminating `control flow`, but I understand that as a natural consequence of commutativity and concurrency – and to only be skin deep: control flow isn’t implicit in the syntax. In my experience, declarative programming is excellent for modeling ad-hoc control flows and work flows in terms of observations and declared rules of action. I point to Berkeley’s Dedalus paper as offering several good examples.

Some people argue that declarative is about `referential transparency` or `purity`, but I reject this. Declarations don’t have a `value` so much as they have a meaning relative to their implicit environment. However, idempotence and commutativity together do allow most of the same abstraction and refactoring advantages achieved by referentially transparent code.

Some people claim that declarative is about `what not how`, but there isn’t any fundamental reason we cannot declare hows instead of whats. To be honest, I’ve never understood the `what not how` argument – what vs. how is a difference in perspective, whether you’re looking up or down the ladder of abstraction. A clever man might try `why not how`. But, again, this seems orthogonal to the notion of declaration – the decision to declare motivation, intention, or mechanism is just stages of abstraction.

There is simplicity and elegance in having `declarative all the way down` or `everything is declarative`, including IO. The input to such a declarative program is a set of declarations – essentially, another declarative program. And so is the output. That implies a declarative representation of implicit environment and computed demands, of observation and influence. Such a design would be useful for abstraction, staging, typing, metaprogramming, embedding and extending languages. Declarations of what, why, and how are mixed in a single program, together with declarations to combine and compile them.

Of course, the benefits of `everything is a` designs are hardly unique to declarative. Similar benefits have been argued for functional, OOP, procedural.

While I don’t list `everything is a` as a criterion for declarative, it would be a fair point in a comparison between languages if arguing which is `more declarative`. Though, I’m not sure the argument itself would have a point. While `more declarative` is better if all else is equal, all else is probably not equal.

There are many programming models that I count as highly declarative under my definition: constraint, logic, temporal constraint, temporal logic, generative grammars, relational, various forms of dataflow, synchronous reactive, event calculus, functional reactive, and reactive demand. These models vary considerably in other characteristics – composition, security, resource management, completeness, staging, etc.. But they are all very declarative.

On the other hand, there are many systems that are ostensively `declarative` – such as HTML or pure functional programming – that I consider to be relatively weak examples of declarative programming (e.g. due to having poor idempotence or commutativity of expressions). Sure, HTML is more declarative than PostScript. And pure functional is somewhat more declarative than procedural (and certainly more cleanly staged). But they aren’t very high on the declarative lattice.

I posit that functional programming is `declarative` primarily to the extent one is functionally constructing a declarative program.

Posted in Glossary, Language Design | 7 Comments

Social Aspects of PL Design

Software development is ultimately a social experience. And I’m not just talking about teams of coders. Even sharing or using code via library, or services via API, is a social experience involving communication and relationships between humans.

Consequently, there exists a vast array of social concerns surrounding the development experience: discovery, configuration and adaptation, maintenance and upgrade, safety and security, packaging and distribution, trust, reputation, blame, spam, licensing, market integration. If some of these aren’t obviously language-design concerns today, it is only because we have humans doing most of the footwork. Today, we human developers must purchase, upgrade, package, configure, track licenses, point fingers, and slog through spam. Greater automation and finer granularity will inevitably require a more formal and explicit representation of social policy.

Social concerns should be addressed in programming language designs. My reasoning is thus: Most code and services I use come from strangers. As years pass, I become a stranger even to my own code. An effective social model for programming should lower the barriers and improve confidence in using code and services developed by strangers. I posit that addressing social concerns does more for usability and productivity than addressing `cognitive concerns` (of the individual developer) because social elements will pay for itself manyfold due to network effects. (That said, I do not believe there are many conflicts between social and cognitive design considerations.)

A language designer should consider such cases as how the module system, type system, and namespace model influence the developer’s social experience. Is it easy and efficient to integrate code that uses a set with code that uses an array? Can developers effectively compose services that use heterogeneous data models? Is it easy to take code written for GTK and reconfigure or adapt it to use QT? Is it easy to track old configurations for regression testing? How much effort is required to disentangle a useful function or subprogram for use in other projects? Is there any mechanism for cross-project refactoring? Will developers easily experience `dependency hell`?

Security lowers a social barrier, mitigating need for `trust` or vetting code in advance. An expressive basis for security (such as object capability model) can also express and enforce interesting social policies within the language and thereby address many social concerns (e.g. Horton protocol).

It might help to have a vision for the social development environment. I have an idea in my head of a Wiki IDE concept as the basis for code sharing and discovery (and also a live projects on the cloud), with a DVCS-like mechanism to move code between wikis.

Unfortunately, there seems to be a lack of control-tested and empirically validated social models for addressing PL design in particular. I’ve not read of any `social dimensions of notation` to mirror Thomas Green’s `cognitive dimensions of notation`.

My own social design efforts are consequently `principled` in nature. Principled design is by no means scientific. It is an even mix of art, reasoning, intuition, and hypothesis, seasoned with a dash of idealism. But it works well in principle :) . Here are some resources from which I draw guiding principles:

I don’t expect anyone to share my preferences for principles. But I would like to see more arguments considering the social aspects of PL design.

Posted in Language Design, Modularity, Open Systems Programming, Security | 1 Comment

Isomorphism is not Enough

Too often, I see arguments of the general form: “I can transform feature X into Y, and the reverse, while maintaining relevant properties. I have an isomorphism. Therefore, I don’t see the difference between X and Y.” For the arguments I’m likely to participate in, X and Y tend to be languages, features, abstractions, protocols, frameworks, or design patterns.

The problem with this line of argument: it assumes transformation is trivial in context, i.e. from the perspective of a language’s user.

For expressive equivalence, we need continuity and locality in addition to isomorphism. Assuming Px is a program from language X, and Py is a program from Y resulting from the translating Px:

  • continuity: every small change in a Px results in a small change in Py. Continuity can be expressed in terms of an asymptotic epsilon bound (as the change in Px approaches zero size, the change in Py also approaches zero).
  • locality: a bounded change in Px results in a change with a corresponding boundary in Py. Essentially, the transform respects modularity and encapsulation.

The mathematical terms to learn are homeomorphism and diffeomorphism.

Isomorphism is not enough. But it is a good place to start.

Posted in Language Design | 4 Comments

Animated Reactive Term Rewriting

Rewrite systems are a general approach for modeling discrete computations and logics. A set of rewrite rules are applied to a term (or graph, or string, etc.) repeatedly until no further rewrites are possible. Rewriting is distinct from functional programming in that the `fixpoint` reduction (repeated application) is implicit, whereas functional expression of the same model would require explicit recursion. Modeling functional programming in a term rewrite system would involve expressing `apply f x` as a term to be reduced, which allows developers to control the number of applications.

I am interested in term rewrite systems (TRS) as a potential discrete state model for reactive demand programming (RDP). Term rewriting is a generalization of state transition systems, which I have already explored as useful state models [1][2]. I mean `animated` and `reactive` in the same sense (and for similar reasons) that I desire them for animated, reactive state transition.

  • reactive: the set of rewrite rules may change over time based on influence from the external environment. Rewriting is logically applied at every instant, thus changes in rewrite rules apply immediately. State effectively `reacts` to conditions.
  • animated: it is possible to model time-varying state with a fixed set of rewrite rules, making it easy to `control` or animate external systems over time even when external conditions are stable. A simple example would be incrementing a number once a second.

Term rewriting is very expressive compared to state transition, i.e. there is a huge difference in what you can express with a finite set of rewrite rules vs. a finite set of transition rules. In context of RDP, where there are at most a finite set of rules at any given instant, this difference in expressiveness corresponds very directly to a difference in computational power and flexibility. However, such power has a cost: rewriting can easily diverge even with a finite number of rules (whereas state transition can only cycle), and it is much more difficult to control computation costs.

Like state transition, term rewriting may be indeterministic. I will assume some (possibly arbitrary or heuristic, based on structure) strict ordering can be applied to the rewrite rules because determinism is a property I desire for RDP.

A basic reactive term rewriting` state model (without animation) would have the following elements:

  • One term register carries state as a complex term.
  • A capability to influence a term register by imposing a rewrite rules in the demand value: (s Rules) ~> (s ()), for some discrete-varying signal type `s`. At any given instant, a finite number of rewrite rules are influencing the term register.
  • A capability to read the term register: (s ()) ~> (s Term).
  • A discovery mechanism, allowing developers to pretend term registers are eternal (and orthogonally persistent). This is necessary since there is nothing new in RDP. Also, ability to reset the register.

Whenever observed, the term is always stable. Rewrite rules may change from one instant to the next, and the term is instantaneously reduced to a new stable state. Change in the term only occurs as a consequence of change in the rewrite rules.

I was extremely unsatisfied with this basic Reactive Term Rewriting model. First, there are the unpredictable costs and divergence concerns. Second, it seems too difficult to model useful behaviors, such as an integer that increments periodically – i.e. you cannot ever express a rule of the form `X => X+1` since that would diverge, but expressing rules of the form (3 => 4) (and switching to a new rule when conditions change) does not leverage the full extent of term rewriting.

Animated reactive state transition added a couple extra registers to expire and change state, allowing state to change over time even when the influence rules are constant. Naturally, I considered animating reactive state transition as part of that. A simple approach would be to use the same expiration technique: a rewrite rule may have two stages, allowing a further rewrite after some time has passed if no other rule applied in between. But that still has the problems with divergence, unpredictable costs, etc.

More recently, I’ve found a model that seems more promising for my goals – based on modeling costs directly. In this model:

  • Each term register is accompanied by an `time debt` register, whose range is >= 0.0.
  • Each rewrite rule is labeled with two positive, real-valued attributes: maximum time debt, and time cost.
  • When a rule is applied, the time cost is added to the time debt register.
  • A rule may not be applied unless the time debt starts less than or equal to maximum specified for that rule.
  • Rules are always applied favoring those of the highest `maximum time debt`. This ensures that rule selection is independent of the initial time debt, and thus makes the model much easier to predict. (One might call this the `path of least resistance` rule.)
  • Time debt reduces toward 0.0 at a constant, continuous rate of 1.0 per second.

Effectively, developers model how long each rewrite rule takes (logically). Since debt is allowed to build, it is possible to model bursty or cascading computations. Animated behavior is easy. Consider a term rewrite model with just one rule and an enumerable term. While the rule is {rule:X => X+1, time-cost:1.0, time-debt-max:0.0}, the term would simply increment once per second. Even though the rewrite rule is active at every instant, the time debt is most of the time too high for the rule to apply. Effectively, divergence becomes animation.

There is no absolute cap for time debt, but at any given instant the maximum debt the model can reach is a simple function of the rewrite rules. Since developers can easily cap the amount of computation that occurs in any given period, this animated reactive term rewriting model is suitable for real-time applications.

Posted in Concurrency, Language Design, Reactive Demand Programming, State | 4 Comments

Disjoin: Decision at a Distance

The disjoin and conjoin behaviors in RDP are the two primitives that connect sums and products:

disjoin :: (x :&: (y :|: z)) ~> ((x :&: y) :|: (x :&: z))
conjoin :: ((x :&: y) :|: (x :&: z)) ~> (x :&: (y :|: z))

The disjoin behavior means that we can take a pipeline `x` and split it into two parallel paths based on state of another pipeline (y or z). Conjoin is dual to disjoin, and allows a partial merge of a pipeline in order to execute the same behavior on both paths.

A little background: the :&: connective indicates a `product` – i.e. that two behaviors or pipelines are active in parallel (equal durations) and may be zipped together into tuples. The :|: connective indicates a `sum` – i.e. only one pipeline is active at any given instant (partitioned durations) based on some earlier decision, and the behaviors might later be merged into one contiguous signal. Use of :&: and :|: represent asynchronous values in that they come from separate pipelines and may have different latencies. `Asynchronous` doesn’t mean `distributed`, but readily allows for it. The zip and merge operators will synchronize and combine these asynchronous values into a single pipeline. Products and sums are duals, with zip opposite split and merge opposite dup:

zip :: (Signal s) => (s x :&: s y) ~> s (x,y)
split :: (SigSplit s) => s (Either x y) ~> (s x :|: s y)
merge :: (x :|: x) ~> x
dup :: x ~> (x :&: x)

To express equivalent of if/then/else expression is a three step process involving split (lifting boolean into control flow), disjoin (makes environment observable within choice), and merge (which `closes` the if/then/else branches).

type EBool = Either () ()
ifThenElse :: (SigSplit s) => (env ~> rhs) -> (env ~> rhs)
           -> (env :&: (s EBool)) ~> rhs)
ifThenElse onTrue onFalse =
  second split >>> -- lift boolean into local control flow
  disjoin >>>  -- distribute decision to split environment
  (bfst +++ bfst) >>> -- eliminate vestigial boolean
  (onTrue +++ onFalse) >>> -- operations on environment
  merge -- combine results of true and false paths

Depending on `rhs`, the above could be a statement or expression.

I think this model of ifThenElse is interesting, in that it exposes some inefficiencies and compositional weaknesses of if/then/else. Every decision involves synchronization at three places: on opening the decision (split), on accessing the environment (disjoin), and on closing the decision (merge). For rich expression and composition, developers should favor naked (onTrue +++ onFalse), working with abstract decisions while delaying commitment to their source. The idea is to avoid local decisions and avoid shared environments, push these things closer to the edges of the application.

These conclusions aren’t unique to RDP. Even down at the CPU level, one might recognize branch prediction concerns as a workaround for the synchronization overheads inherent to if/then/else. There are proposed alternatives even for imperative and functional programming – cf. Multi-return Function Call by Olin Shivers and David Fisher [LtU node].

Implementation

In practical terms, `disjoin` on (x :&: (y :|: z)) can be implemented by a multi-step process:

  1. Duplicate the `x` behavior for the LHS and RHS of the :|: connective.
  2. Mask the `x` behavior on the LHS with the activity of `y`, and mask the `x` behavior on the RHS with the activity of `z`.
  3. To generate the `control` signals for y and z, I need something like a `drop` behavior (drop :: forall x . x ~> u ()).

Steps 2 and 3 together imply that there is a global `unit` signal type – `u ()` – for each RDP behavior, such that every signal can be masked by u (), and every signal can reduce to u (). A global unit type implies a global time model, but still allows for a variety of signal types (i.e. continuous time-varying vs. discrete time-varying signals). Unfortunately, representing existence of this unit type in Haskell is somewhat painful (at least if I want to keep generality, it means enforcing a typeclass on every signal that becomes part of the behavior), and I’m still working on a `comfortable` way to do so, including tweaks to the signals model I described in an earlier article.

Static Sums and Dynamic Behaviors in Reactive Models

I have studied many reactive programming models, in many variations. First-class support for conditional expressions at the reactive layer is a very rare feature. It isn’t really a necessary feature for anything: dynamic behaviors (generate behavior based on observed conditions or history) and use of decisions as part of the value layer (Either a b) can adequately cover any cases where one might use a static set of choices in the reactive model.

While explaining RDP in the past, I have a few times been presented with a question: why bother with sums when you have dynamic behavior?

First, sums help complete the model, makes it less dependent upon the underlying `value` layer. Indeed, I could lift the notion of `boolean` values into my model (RBool s = s () :|: s ()), and conceptually (though not pragmatically) could model 32-bit integers as products of 32 RBool choices. I don’t really need any other values than asynchronous ones, so long as I provide some behaviors for `+` and `*` and the like. Sums are, as noted above, the dual to products.

I could wax on about how sums and dynamic behaviors serve nearly orthogonal roles, and about the virtues of duality (sum is yin to product yang), and present analogies to using `case` statements when you could use Church booleans in Haskell.

But here’s the real reason: performance and stability.

  • Static sum behaviors provide stability, allowing a single behavior to serve under many common conditions, and thus survive longer in context without need to swap out. Stable behaviors avoid set-up and tear-down costs, and support many optimizations (memoization, caching, partial evaluation or tracing JIT) more effectively than volatile behaviors.
  • Dynamic behavior provides specialization, allowing an application to make only relevant choices based on the observable context and accessible resources. Basically, dynamic behaviors implies dynamic compilation and linking. Developers can pay for a closer approximation to what they use.
  • Dead-code elimination does not easily cross an `eval` expression since it is difficult to anticipate how much of the environment an evaluated expression will observe or influence. But if `sum` types are inputs or outputs from a dynamic behavior, one can at least avoid synchronization and possibly eliminate dynamic code based on dead sinks (i.e. an output that is dropped by the static context).

I envision developers relying on these combined properties, i.e. building towers of reactive staged interpreters and leveraging push-based reactive semantics to more easily identify points of specialization (any interface of slow-changing code with fast-changing code or data).

Posted in Concurrency, Language Design, Reactive Demand Programming | 2 Comments

Declarative, Reactive Command Line

A good command line interface allows its users to pack a lot of useful work into strings the size of twitter messages. There is no room for boiler-plate, no patience even for parentheses. Identifiers must be short. Parameters must have good defaults. Failure should usually be obvious and safe, allowing users to refine their commands and try again.

A common property of most command-line interfaces today: they are of an imperative, procedural nature. It is difficult to declare the answer before prompted, which ties the model to time. The querying app must often wait for a prompt to be answered – i.e. interactive rather than reactive. It is difficult to modify an answer to an earlier query (e.g. “What is your name?”). Queries on the system are volatile and must be polled to remain up-to-date. System behavior is sensitive to order of expression. A command expressed twice will typically result in two distinct behavior instances.

A declarative, reactive command line would have the following properties:

  • Behavior of system is declarative. That is, users declare how a system behaves with a fair level of idempotence (declaring a behavior twice doesn’t result in twice the behavior) and commutativity (order of declaration not so critical).
  • Behavior of system is reactive. Any earlier declaration of behavior may be revoked or modified.
  • Time is generally implicit, to keep lines short. It is possible to declare many behaviors as occurring at the same time, i.e. a batch update.
  • Relationship between user and application may be maintained through command line. Users may anticipate answers to prompts, and declare answers in advance. Users may change declared answer at any time.
  • Prompts to user are inherently reactive, not interactive, meaning the app does not implicitly wait. If app must wait for valid answer, this must be modeled explicitly in the app – if(bad_answer) then nothing else doCoolStuff.
  • Most output to user is also reactive and subject to change over time. I.e. “Hello, World!” might later change to “Hello, Console!”. This could be expressed quite naturally with graphical output or curses. If streaming, a less natural mechanism might be required – but it could at least be symmetric to how the user modifies declarations.

I envision such a system working a lot like a tuple space: CLI actions add or remove declarations from the space. A simple case might be declaration like `+lights` turning on the house lights, and `-lights` turning them back off. At least one service (the shell) is observing the space and searching for other services to fulfill each request, but I think it might be more interesting to run this in reverse: allow any number of observers to `interpret` the declarations when deciding their behavior, and attach the console to different tuple spaces as needed.

Prompts from the application are given a unique structural identity, e.g. of the form “com.example.hello.user_name”, and all prompts on the ID will return the same value, which can be changed over time. Prompts can carry a lot of extra meta-data, including query strings.

Properties such as filters, queries, histories, suggestions and tab-completion, macros, aliases, topics or switching, CSCW, and persistence would help modernize the shell. Structured text data rather than plain strings could improve filters and queries a great deal, and support level of detail. JSON may be a bit too verbose, but something like ML9 would allow adding just a little structure where necessary and implicitly annotating each declaration with metadata.

Posted in Open Systems Programming, User Interface | 9 Comments