Why Not Events

I’ve spent a lot of time arguing against event systems. I don’t argue because they’re bad; compared to batch processing or shared-state imperative concurrency, event systems (actors, vats, channels, etc.) solve many problems and are relatively easy to reason about. Rather, I argue against event systems because we can do even better. But my writings are scattered across many forums.

This article shall provide a place to consolidate my arguments against event-based modeling and control of systems. It does not promote any specific alternative – I leave that to other articles.

By events, I include commands, messages, procedure calls, and other conceptually `instantaneous` values in the system that independently effect or report changes in state. Consequently, I consider message passing systems to be event processing models. In general, I assume that events are processed by single threaded event loops without shared state, though there might be multiple such threads in an event driven system. I.e. I’m claiming a position even against sane event systems – like the actor model, pi calculus, CSP, and E-language vats. (The event calculus is closer to temporal logic; most of my complaints do not apply to event calculus, despite the name.)

By their essential nature as instantaneous observations and effects, events introduce much `accidental complexity` in our systems. As with most systemic accidental complexity, this is not obvious until you’ve experienced something better. Here are examples of accidental complexity and the atrocities people inflict upon themselves with event systems:

  1. We very often want views of the current state of a system. Events represent or effect changes in state. By using events, we commit ourselves to using event accumulators – explicit state – for this common computational task.
  2. We very often want to combine events from multiple sensors and sources in order to represent composite events. This event fusion requires much explicit state.
  3. Often we wish to fuse views of overlapping subsystems, or overlapping perspectives. When views are event streams, there is a troublesome condition of implicit event replication: e.g. a single button press might be observed in both event streams, perhaps with translations, with no general means to recognize it as a common event that should only be processed once. Inability to effectively work with overlapping data-model abstractions severely hinders compositional reasoning and reuse of event streams.
  4. Events require much implicit state in their communication models – i.e. queues, events in transit. Each event carries a snapshot of the past into the future, with implicit time passing between send and receive.
  5. Events handle inconsistently for non-linear processing. In particular, when we split an event into two events, we must order them – introducing a notion of `time` (this event before that one) within what was a conceptually instantaneous action.
  6. State in event systems is fragile. A lost, reordered, or replicated event can affect (and potentially corrupt) the whole future of a program. This is true of both explicit and implicit state in event systems. Minor bugs in code (e.g. missing an event to release resources) or hiccups in scheduling or communication tend to propagate.
  7. Abstractions in event systems are fragile. Event fusion is highly sensitive to local, arbitrary ordering decisions for merging events that otherwise appear simultaneous. Even in a deterministic system, it is difficult for two observers to achieve consistent views of complex event streams without sharing actual implementation code. Consequently, event systems are difficult to reason about in the presence of open extension. In general, we cannot robustly compose views specified in event systems.
  8. Event ontologies are not compositional. Sending two events in most event systems lacks the same meaning or performance (atomicity, progress, efficiency, latency) as sending one composite event. Providing events from two sources is not the same as fusing events then sending the composite. Developers are faced with tidal pressures towards both larger events (for efficiency, atomic updates, data fusion) and smaller events (for simplicity, modularity). Ontologies grow in ad-hoc manners, hindering development of generic composition operators. This issue can be mitigated by code-distribution abstractions (batching, promise pipelining, mobile agents, scripting).
  9. Event systems work harder. Events are the “changes in state” that someone else (e.g. a framework developer) considered important enough to report. And that someone else often lacks the foresight to accommodate our needs. We invariably need another perspective. For event systems, achieving another perspective requires three complex tasks: fusing events into state, detecting patterns in prior state relative to the current state after each event, and generating a new series of events that we consider important. By comparison, state transform models and filters (functions, queries) are relatively simple to express, reason about, and automatically optimize.
  10. Event systems lack generic resilience. Developers have built patterns for resilient event systems – timeouts, retries, watchdogs, command patterns. Unfortunately, these patterns require careful attention to the specific events model – i.e. where retries are safe, where losing an event is safe, which subsystems can be restarted. Many of these recovery models are not compositional – e.g. timeouts are not compositional because we need to understand the timeouts of each subsystem. Many are non-deterministic and work poorly if replicated across views. By comparison, state models can generically achieve simple resilience properties like eventual consistency and snapshot consistency. Often, developers in event systems will eventually reinvent a more declarative, RESTful approach – but typically in a non-composable, non-reentrant, glitchy, lossy, inefficient, buggy, high-overhead manner (like observer patterns).
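
Points 1 and 6 can be made concrete with a small sketch. This is an illustration in Python, not code from any particular event framework; the names `accumulate` and `latest` and the message shapes are assumptions chosen for the example. Both functions track which buttons are held, one by folding change events into an accumulator and one by taking the most recent full state.

```python
# Hypothetical illustration: tracking the set of held buttons two ways.

def accumulate(events):
    """Event style: fold change events into explicit state."""
    held = set()
    for kind, button in events:
        if kind == "down":
            held.add(button)
        elif kind == "up":
            held.discard(button)
    return held

def latest(snapshots):
    """State style: every message carries the full current state."""
    held = set()
    for snapshot in snapshots:
        held = set(snapshot)
    return held

events = [("down", "A"), ("down", "B"), ("up", "A"), ("up", "B")]
snapshots = [{"A"}, {"A", "B"}, {"B"}, set()]

# Lose the third message of each stream.
lossy_events = events[:2] + events[3:]
lossy_snapshots = snapshots[:2] + snapshots[3:]

stuck = accumulate(lossy_events)      # "A" stays held for the rest of the run
recovered = latest(lossy_snapshots)   # the next snapshot repairs the view
```

With the lost message, the accumulator never learns that A was released, while the snapshot reader is only briefly out of date.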

The complexity of event systems and eventful state models is indexed on permutations of expressions, whereas the complexity of declarative systems is indexed on combinations of expressions. Event systems, in general, are exponentially more complex than declarative systems to achieve the same goal. And most of that complexity is non-essential, accidental. All but the most trivial of event systems quickly grow so complex and difficult to reason about that there are no obvious bugs or inefficiencies.

The usual knee-jerk response to eliminating events is:

What about button presses?!

To which my knee-jerk answer is: this is an almost stupidly trivial example to defend event systems, and it is trivial to model button presses – or any other real-world event (ignoring eventful abstractions you invent for yourself after assuming an eventful paradigm) – in terms of live state. A button-press is observable as an up state, followed by a down state, followed by an up state – each state with a positive, computable duration. (And by computable I mean a subset of the rational subset of real numbers. Even one picosecond is okay.) Ah, but I’ve learned that playing roshambo with knee-jerk responses is painful and unsatisfactory for all parties.

It seems, after wading through confusion and miscommunication, the actual concern is that: “but querying state with events is lossy! I’ll lose button presses! and it’s really frustrating to lose button presses, e.g. when working with a joystick.” When said like that, it becomes obvious that the problem is a bad assumption: that state can only be queried by events.

With an “eventless” model, we cannot use events to query for state! (And I mean that in the trivial sense; to do so would be a contradiction in terms.) We must instead use state to query for state. State is continuous. Our queries are continuous. You might understand such queries in terms of “subscriptions”. The FRP and synchronous reactive communities understand them in terms of signals. The database community might grok them in terms of streaming temporal data. In any case, if our queries are properly continuous, we will not miss any intermediate states. We’ll observe button presses even if they last only one picosecond, no polling.
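
The difference between polling and a continuous query can be sketched as follows. This is a minimal illustration in Python under an assumed representation: a signal is a sorted list of `(start_time, value)` segments with exact rational times. The names `sample` and `intervals_where` are hypothetical.

```python
from fractions import Fraction

PICO = Fraction(1, 10**12)

# Assumed representation: each value holds from its start time until the next start.
press_start = Fraction(21, 20)
button = [(Fraction(0), "up"), (press_start, "down"), (press_start + PICO, "up")]

def sample(signal, t):
    """Polling: the value of the signal at a single instant t."""
    value = signal[0][1]
    for start, v in signal:
        if start <= t:
            value = v
    return value

def intervals_where(signal, wanted, end):
    """Continuous query: every interval during which the signal equals `wanted`."""
    result = []
    for i, (start, v) in enumerate(signal):
        stop = signal[i + 1][0] if i + 1 < len(signal) else end
        if v == wanted:
            result.append((start, stop))
    return result

# Polling ten times per unit of time never lands inside the press.
polled = [sample(button, Fraction(t, 10)) for t in range(21)]

# The continuous query reports the press, however short it was.
presses = intervals_where(button, "down", Fraction(2))
```

Here every polled sample reads "up", while the query over the whole signal returns the single one-picosecond interval.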

The representation of buttons as signal sources (representing current state) offers several advantages over eventful buttons. In particular, it is easy to combine observations of multiple buttons, or combine observation of a button with some other ad-hoc condition (when Foo and Button Pressed ...). Such expressions on multiple sources are robust against order of expression, and semantically stateless, making it very easy to achieve consistency between observers. It is easy to combine overlapping views that share the same button, with a resilient and compositional safety net for consistency (snapshot consistency, or eventual consistency, depending on how much intermediate processing is contributing to the views). A late arriver observing the system can view the state for all the buttons, not just the ones most recently manipulated, without explicit ad-hoc support. Eventless observations on state are commutative, idempotent, concurrent, continuous, and declarative.

When we do need events, we can recognize them as differences in state. This is valuable because we often seek complex ad-hoc events, just as we seek complex and ad-hoc conditions. (Usefully, some models may allow stateless event detection – i.e. by comparing present and future so we don’t depend on any estimate of the past.)
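
Recognizing events as differences in state might look like the sketch below. It assumes a signal recorded as `(time, value)` pairs; the function name `changes` is hypothetical and the representation is only one of many possible.

```python
def changes(signal):
    """Derive 'events' as differences between consecutive states of a signal."""
    return [
        (t, before, after)
        for (_, before), (t, after) in zip(signal, signal[1:])
        if before != after
    ]

# A redundant report of the same state (time 5) produces no event.
button = [(0, "up"), (3, "down"), (5, "down"), (7, "up")]

transitions = changes(button)
# [(3, "up", "down"), (7, "down", "up")]
```

Because the events are computed from state rather than carried as primitive messages, repeating or re-sending a state report does not create a spurious event.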

Of course, we do need to use declarative state models. The imperative state models with which you are most familiar are generally designed for event-driven manipulation and observation (mutual exclusion, isolation, transactions). Declarative state models must handle concurrent, continuous influence. A button press might influence state for one picosecond, but that’s potentially enough to change it.

This entry was posted in Concurrency, Language Design, Open Systems Programming, State.

16 Responses to Why Not Events

  1. Geoff H says:

    This only makes sense for the base system that gathers events, not sub-systems that receive events. Polling times are not going to be the same in all applications, and an application that polls more slowly than the state changes will indeed miss keystrokes (missed keyboard characters are much more annoying than the joystick example given, as they cause typos).

    Events, collected by a fast poller, are sent in a reliable and ordered manner, so slow-polling applications (slow due to their workload) can still get reliable input, which they could not if they polled the raw state at their leisure.

    If you can guarantee fast polling, sure, it’s better than events. If you can’t, then events give the reliability of the proper stream of input.

    • dmbarbour says:

      To push representational state (like HTTP `PUT`) would be quite acceptable. Pushing state has many nice properties that events do not: idempotence, robustness to lost updates, eventual consistency. If you add time-tags to the state, then pushed state also becomes commutative and monotonic.
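
The properties claimed for time-tagged pushed state can be checked with a small sketch. This is an illustration in Python, not code from the discussion; `merge` and `toggle` are hypothetical names, and breaking timestamp ties on the value is an assumption made so that `merge` stays commutative.

```python
from functools import reduce
from itertools import permutations

def merge(a, b):
    """Keep the update with the later time-tag (ties broken on value, by assumption)."""
    return max(a, b)

updates = [(1, "up"), (2, "down"), (3, "up")]

# Deliver the updates in every order, with the first one duplicated.
results = {reduce(merge, order) for order in permutations(updates + updates[:1])}

def toggle(state, _event):
    """An event that reports a change rather than a state."""
    return "down" if state == "up" else "up"

delivered_once = reduce(toggle, ["press"], "up")
delivered_twice = reduce(toggle, ["press", "press"], "up")
```

Every delivery order of the time-tagged states converges on the same result, whereas a duplicated `press` event leaves the toggled state wrong.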

      The problem with events is not that they are ‘pushed’, but rather that they represent a change in implicit content. Even if you assume a reliable event stream from “the base system that gathers events”, you will encounter the issues I described: accidental complexity, non-essential state, and fragility. The robustness issues will exhibit due to concurrency, race conditions, state management, dynamic observers, and buggy code. Complexity is bugs.

      A state stream, such as a signal, will offer greater reliability and robustness than an event stream. There is a cost: you will communicate and compute larger chunks of data than an event stream would require. Event stream processing is a lot more accessible to hand-optimization. But this cost is not on the reliability axis.

      NOTE: I do not propose that applications poll for state. Polling is an eventful idiom: you can’t even talk about polling without talking about “poll” events.

  2. bmeph says:

    I find your ideas intriguing and would like to subscribe to your newsletter.

    Seriously, you should take some time to gather together all of your summary posts, and write a paper, or three.

  3. Paul Barrass says:

    Interesting article, and nice site. I found my way here through Code Project and found myself staying for a while. Nicely thought out, and well considered, but I think the limitations (if you like) and historical development of our hardware alongside the kernels that allow us to interact with said hardware mean that the event model is pre-disposed to work whereas the declarative state model you mention, simply isn’t.
    Just as an example, your own site, on conflict resolution for multiple declarations within the declarative state model, states the three options for resolution to be a) report conflict, perform idle; b) pre-configured arbitrary win on declaration resolution; and c) random.
    These things just don’t occur at a hardware level (there is an associated hardware cost with any of them that your level of abstraction fails to take into account), and I know a single write agent has been discussed, but unless you have a single agent acting as the kernel of all things including being a direct hook for application code, (which is undesirable if not impossible), you do not guarantee your state “tick” and you are in all kinds of trouble. Even with a single kernel agent, reading and writing to, and clearing of hardware registers is an eventful process, where mutex, transactions and isolation are demanded by physics rather than design. As we design and build our languages up from the hardware it’s no surprise to see that events are still hanging around so much. (I do agree with you that events are not just modern “events”; they are messages/interrupts, etc which are polled, trapped and sent/received, etc.)

    From personal experience in telecomms, a declarative state model was tried for mux’ing (multi-plexing) voice in the early seventies where the event driven model was considered to have too large an overhead, and although the state model remained, the controlling engine and network interface remained event driven as the hardware clocking functionality demanded it. I agree that this has introduced complexity where it is not necessary, but as long as the designers manage the complexity correctly, I do not personally see it as as much of an issue as you seem to. (As an aside, there are still some declarative state streams within the voice sphere, but they are only used as a lossy technique as eventually you have to process the down stream, which is an event driven process running off the back of a hardware clock with no guarantees you catch all state. No good for lossless demands, but with regards to that I can see game engines finding them very useful.)

    • dmbarbour says:

      Modeling traditional imperative state requires conflict resolution, which becomes somewhat arbitrary; I have also been developing declarative state models that lack this arbitrariness. Also, even imperative state models can be augmented to take hardware costs into account, cf. animated term rewriting. If I were to model a CPU, I would probably need to use an animated (aka timed) state model.

      But I find interesting that many hardware components seem to favor a more declarative programming model. GPUs are all about streaming data. Graphics pipelines are essentially pure functional transforms. The resulting output of the GPU number crunching pipeline is a time-varying frame – a signal. Control systems are often defined by wire diagrams with feedback loops representing state. DSPs, FPGAs, future memristors, and even hardware fabrication itself (3D printing?) all seem promising targets for declarative specification. User input is very effectively described by signals, and much artistic work and screen layout can be modeled readily with constraints. Similarly, while declarative is poor for modeling a single CPU, it is quite effective when we start to scale outwards – for managing code and data distribution across CPUs and networks. This has been realized many times, resulting in the CALM conjecture and development of Bloom.

      Unlike peripheral hardware, CPU hardware is more severely constrained in its development to support existing software, which is in turn written to the CPU hardware in a vicious cycle. Hardware has been trying to break out of that mold. You can see some of that with the pressure towards weaker memory models, pipelines, and so on. I remember, perhaps five years ago, reading of processing units developed for ultra-low-power computing and declarative composition via hooking CPU-local memories together (no bus, no central memory). Such devices will have difficulty penetrating the current markets. (The desire for low-power ubiquitous computing is, today, a significant pressure for research back into more analog hardware and computation models.) Rather than building up from hardware, I have favored the opposite direction: to develop declarative paradigms that are suitable for low-level programming and even as a basis for hardware. (My RDP would need to be augmented with linear types to control fan-in on resources, but is otherwise not unsuitable for hardware.) The primary requirement for being suitable for low-level programming is a locality constraint on computation: the model must address communication of information, and must ensure that computation only needs local information. Being designed to scale in multiple directions, RDP programs can compile efficiently even for traditional CPUs, avoid need for global GC, and are well disposed to support open distributed systems. (The potential for targeting GPGPUs is also high.)

      Anyhow, while hardware has certainly had historical impact on programming models, I do not believe it to be a significant issue for this article. Whether our software is ideally suitable for individual CPUs is just a performance issue – and one that can be effectively addressed (to within a few percent of optimal) by careful choice of declarative abstractions. In this ‘Why Not Events’ article, I think performance is a much lesser concern compared to complexity, bugs, robustness, and productivity. You say “as long as the designers manage the complexity correctly, I do not personally see it as as much of an issue”, but I think it is clear – from the proportion of software projects that are over schedule, over budget, and still buggy – that it is difficult to manage complexity correctly, much less swiftly or painlessly. I once heard an excellent summary of complexity: complexity is bugs. I offer this for consideration: perhaps “managing complexity” (e.g. encapsulating it with OOP) is the wrong answer, and – as much as possible – we should instead address causes of unnecessary complexity and remove them. Eventful programming is a cause of much accidental complexity, and many bugs, no matter how you manage it.

  4. Pingback: fogus: The best things and stuff of 2012

  5. I’m curious about your thoughts on events outside of a distributed context. For example, the “Command Pattern” used in rich client applications such as Photoshop. I think that in a local context, a lot of the problems are irrelevant:

    1) The application’s “current state” is a natural event accumulator
    2&3) Even with multiple sources, like a keyboard and a mouse, it’s reasonable to impose a total ordering via an event loop because the total number of events is small and centrally located on a single node.
    4&9) Commands capture user intent, since they reify provided inputs. That implicit state is inherent to the user’s mental model of the application when issuing a command.
    5) All processing is linear
    6&10) Events can’t be lost, reordered, or replicated
    7A) Load is usually very light (since there is only one user generating events)
    7B&8) Open extension and composability are problems generally limited to plugin systems; most applications have a closed command vocabulary. Plugin systems are often best implemented as distributed systems. In such a case, systems can be layered: local models are updated via events/commands and then changes in state are communicated in a distributed fashion by the mechanisms you describe.

    I think that there are many parallels to the Clojure Agents vs Erlang Actors as described here:
    http://bc.tech.coop/blog/081201.html (search for “I chose not to use the Erlang-style actor model”). In the interest of being “as simple as possible, but no simpler”, it seems to me that events/commands are ideal for single-partition systems. However, you make an excellent case that events are too simple for distributed systems.

    What do you think?

    • dmbarbour says:

      As far as event patterns go, command pattern has a lot of nice properties (e.g. with respect to transactions, persistence, undo, replay after code edit, testing) and is (by far) one of the better options for organizing application code. But it still leaves much to be desired with respect to composition, extension, CSCW, heterogeneous views, or synchronized rendering of live data feeds. And events are not very effective for modeling or processing an array of common HCI devices (e.g. joystick, gestures, voice, video).

      Regarding your numbered points, I agree with most of what you say. However, your point `4&9` is not convincing. It is not clear that implicit state of the application model in any way reflects the user state in making a decision. If we ever do wish to present heterogeneous views, then ‘commands’ to one view will generally be transformed to multiple commands on the canonical model, and conversely updates to the canonical model will generally transform to multiple updates in the view. Thus issue 9 is well intact. Capturing intent means recording a semantic (rather than structural, layout-dependent) representation for captured input. This is easily achieved in any sane programming model, but does not address any of the issues with events.

      I would not call the problems with events ‘irrelevant’ even in context of local, rich client applications. I think they impose opportunity costs. But opportunity costs are notoriously difficult to evaluate. I believe it is quite feasible to model rich client applications without events, and that (after the tooling for doing so matures) it will ultimately be the better option. I’m already contemplating some interesting, concrete designs for an eventless reactive application server, which I aim to build above Sirea this year. Meanwhile, we should work with the tools available to us (as working with them is better than working around or against them). In that context, I approve of designing apps around the command pattern.

      • > But it still leaves much to be desired with respect to composition, extension, CSCW, heterogeneous views, or synchronized rendering of live data feeds.

        There are definitely shortcomings, but I wonder if it’s something that is best solved with layering. Considering collaboration, one approach is Operational Transform, but that gets you back into the territory of the concerns from your post. So, if it were me, I’d be exploring how to use declarative state there, rather than interfering with the desirable mental model of an undo stack.

        > It is not clear that implicit state of the application model in any way reflects the user state in making a decision.

        One example, from Photoshop [1], is the Resize Image dialog. The dialog supports both absolute and relative resizing, as well as the option to constrain image proportions. The expectation that users have, proven via UX testing, is that the most recently edited fields are preserved while other fields are updated from constraints. When the OK button is pressed, an image resize command is executed. Commands are captured for replay. You can name and save command scripts, but even in the simplest case, there is a commonly used “do-again” command. If you did an absolute resize, then do-again is a no-op. But if you did a relative resize, say to 50% width, then do-again will produce an image 25% of the original width. Implicit state is inherent to communication from the user in both synthesizing the command, as well as the meaning of “do _that_ again”.

        > I believe it is quite feasible to model rich client applications without events, and that (after the tooling for doing so matures) it will ultimately be the better option
        > [...snip...]
        In that context, I approve of designing apps around the command pattern.

        I’m very interested to see what develops here. I think that there are some more fundamental challenges first, which is why tooling for UI development is generally pretty bad. That is an area I’m actively working on. My plan is to embrace the command pattern as “the best we’ve got for now”, so that I can focus on solving other problems and the user experience of the tooling.

        [1] http://dl.acm.org/citation.cfm?id=1449913.1449927

  6. shelby says:

    You don’t really want to eliminate events, e.g. your button state change is still an event, rather you want to eliminate callbacks, because they can’t be composed without local state and unavoidable ordering (since each callback and local state represents an infinitesimally small period, i.e. a discontinuity).

    [edited]

    • dmbarbour says:

      Many issues with callbacks are rooted in formal, semantic problems with events.

      More precisely, sequential, infinitesimal delay is the root of many event evils. Consider: you have time-ordered EventStream a = [(T,a)], and two streams that respectively contain [(t0,a1),(t0,a2)] and [(t0,b1),(t0,b2)] (representing events infinitesimally offset from t0). To be globally consistent, how you compose these two streams should depend on how they are globally related (e.g. by earlier dup, loop, split). Without that knowledge, merge order is arbitrary and (as mentioned above) event-stream processing abstractions become fragile. Most of the problems I mentioned above (ontology, stateful views, overlapping views, support for open systems and late-arriving observers) still apply.
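
The arbitrary merge order described here can be shown in a few lines. This is a sketch in Python with hypothetical names (`merge_by_time`, `last_value`), using the two streams from the paragraph above with `t0` taken as 0.

```python
# Two event streams whose events all carry the same time t0 = 0.
a = [(0, "a1"), (0, "a2")]
b = [(0, "b1"), (0, "b2")]

def merge_by_time(first, second):
    """Merge two streams by time. sorted() is stable, so ties keep argument order."""
    return sorted(first + second, key=lambda event: event[0])

def last_value(stream):
    """A simple stateful observer: remember only the most recent event."""
    return stream[-1][1]

ab = merge_by_time(a, b)
ba = merge_by_time(b, a)

seen_by_first_observer = last_value(ab)    # "b2"
seen_by_second_observer = last_value(ba)   # "a2"
```

Both merges respect the timestamps, yet two observers that happened to list the sources in a different order end up with different state.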

      If you guarantee for every event stream that the event times are computably distinct, then you could avoid many of those issues. The ontology issue is mitigated since you’ll now be able to add extra event streams for extra control, rather than extending the events on an existing stream. But this does result in awkward composition operators (i.e. you can’t have zip or merge; you need a composite ‘mergeZip’). And the models still require an abundance of local state, and hinder late-arriving observers from catching up.

      Rather than communicating a button state change event (which observers must process and remember), I would prefer to communicate current button state (so observers don’t need to remember). Where I do need to model events (as for integrating with eventful foreign services), I have idioms to do so – e.g. modeling a time-ordered event log then reporting or influencing the tail-state of this log. Explicit modeling of events doesn’t avoid issues with events, but does make them more obvious and addressable per use-case, and helps discourage non-essential use of events.

      I really am suggesting we eliminate use of ‘event’ abstractions, not just of local state.

  7. shelby says:

    Pondering your comment about the different ways you are thinking about modeling events, e.g. button click, it occurs to me that the problems derive from the time-dependent orders implicit in any stored state.

    When we are declarative, we inherently abstract away unnecessary state.

    For example, instead of modeling a drag-n-drop operation as `drag on mousedown, loop on mousemove, drop on mouseup`, we can model as `while dragging then drop`. This converts the modal on drag initialization function that sets the new state to a pure function which returns the new state while dragging.

    I guess I am realizing that we must define semantics which minimize state and make the unavoidable state very closely coupled to the intended semantics.

    AFAIK, David’s point-free semantics is a generalized paradigm to globally model the multi-signal stream. I still don’t have a good mental model of how this is beneficial as compared to, for example, having an API for consuming any two events as a single one over an allowed interval. What little I understand of it feels very abstracted away from what the programmer wants to declare.

    [edited]

    • dmbarbour says:

      Assume I press and hold the A button, then press and hold the Z button. With button-state signals, I can easily and statelessly compute a signal that represents when both the A button and the Z button are held down. To achieve the same with events would require keeping state for each of the A and Z buttons, updating that state for each change event, and generating an event whenever we move to or from a state of both being down. Regarding your earlier comment, it is worth noting that “consuming any two events as a single one over an allowed time interval” is insufficient for this observation, since the button presses may be separated by a time greater than the allowed interval.
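
The A and Z comparison can be sketched as follows. This is an illustration in Python, not code from RDP or Sirea; it assumes signals are plain functions from time to a boolean, and `held_between`, `both`, and `BothDownFromEvents` are hypothetical names.

```python
def held_between(start, stop):
    """A button signal: True while start <= t < stop."""
    return lambda t: start <= t < stop

def both(s1, s2):
    """Stateless combination: a new signal defined pointwise from two others."""
    return lambda t: s1(t) and s2(t)

a = held_between(1, 10)
z = held_between(4, 6)
a_and_z = both(a, z)

class BothDownFromEvents:
    """Event style: remember each button's last known state, emit on transitions."""

    def __init__(self):
        self.down = {"A": False, "Z": False}
        self.emitted = []

    def on_event(self, t, button, is_down):
        was = all(self.down.values())
        self.down[button] = is_down
        now = all(self.down.values())
        if now != was:
            self.emitted.append((t, now))

tracker = BothDownFromEvents()
for event in [(1, "A", True), (4, "Z", True), (6, "Z", False), (10, "A", False)]:
    tracker.on_event(*event)
```

The signal version holds no state at all, and an observer arriving at time 5 can evaluate `a_and_z(5)` directly. The event version gives the same answer only if it has seen every change event since before time 1.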

      To detect a double-click, state is necessary. This is due to the nature of the problem: a double-click involves conditions that hold at different times. When state is necessary due to the nature of the problem (rather than the paradigm), I call it ‘essential’. I do not hesitate to use state where it is essential. I might approach this by continuously recording the button state over time, and concurrently report whenever a double-click pattern is detected in the history. That pattern should be easy to generalize and abstract into an API. If I used button events instead of button state, the pattern wouldn’t change much.
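
One way to read the approach described here, recording button state over time and searching the history for a pattern, is sketched below. This is a minimal Python illustration under assumptions: times are integer milliseconds, and `press_times`, `double_clicks`, and the 500 ms threshold are hypothetical.

```python
def press_times(history):
    """Times at which the recorded state goes from 'up' to 'down'."""
    return [
        t
        for (_, before), (t, after) in zip(history, history[1:])
        if before == "up" and after == "down"
    ]

def double_clicks(history, max_gap):
    """Pairs of consecutive presses that start within max_gap of each other."""
    presses = press_times(history)
    return [
        (first, second)
        for first, second in zip(presses, presses[1:])
        if second - first <= max_gap
    ]

# Recorded button state: (time in milliseconds, state).
history = [
    (0, "up"), (1000, "down"), (1100, "up"),
    (1300, "down"), (1400, "up"),
    (5000, "down"), (5100, "up"),
]

found = double_clicks(history, max_gap=500)
# [(1000, 1300)]
```

The recorded history is the essential state; everything else is a pure function over it. Note that this simple version would report a triple-click as two overlapping double-clicks.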

      Several benefits of signals over events do not regard use of essential state, but rather the avoidance of non-essential state.

      This is a point I made repeatedly in the article: event stream processing requires a lot of non-essential state for almost every interesting observation or operation.

  8. dmbarbour says:

    WordPress blog comments are a horrible place for discussion. Shelby and I had a long mostly off-topic discussion, the full contents of which are available as a google doc. I’m editing the remaining posts (including Shelby’s) to cut down on the noise level and keep it mostly on-topic. (See my comments policy under `About RDP`.)

  9. Ruff says:

    Could anyone give a practical example of what is discussed here? Taking the A and Z button example, I wonder what both the event-driven and state-driven approaches would look like for the 2 scenarios:
    1) keys A and Z are observed by the same network node
    2) A and Z keys are observed by different network nodes.

    Does not a repartitioning of a distributed system break the “state contract” i.e. if shifting from scenario 1) to scenario 2), does it not also introduce the same difficulties (inconsistencies, race conditions, times of uncertainty as for state) using state-driven approaches as if event driven approaches were used, eventually introducing the risk of breaking the application which requires the state of A and Z key and its logics?

    • dmbarbour says:

      State systems and event systems do have similar issues regarding inconsistency, indeterminism, and disruption in the communication layer. However, state systems are more generally robust and resilient under these conditions. (For resilience, state updates can support eventual consistency, snapshot consistency. For robustness, two state updates with the same timestamp are commutative, and redundant state updates are idempotent.)

      But only point 10 in the article is concerned with inconsistency, indeterminism, and so on. Events have a lot of difficulties even in a fully deterministic system (see points 1 to 9). So to answer your question: no, states don’t have “the same difficulties [..] as if event driven approaches were used.”
