Reactive State Transition (RST) is a simplistic state model I developed for use with Reactive Demand Programming (RDP). RST serves a similar role as would a finite state machine. Unlike a typical state machine, RST reacts to observable system state over continuous time rather than a stream of events. In this article I’ll describe the model.
Reactive State Transition Model
A Reactive State Transition model consists of:
- a set of potential states
- a discretely time-varying set of transition rules
- a current state
At each discrete instant, the set of transition rules for that instant is applied to the state from the previous instant, and the result becomes the current state. When there are no valid transitions to a new stable state, the previous state is maintained until the next instant.
RST can roughly be modeled in a traditional state transition system by simply labeling transitions with time, and incrementing time whenever the next available transition is in the future. The advantages of RST come from the monotonic nature of time (allowing easy garbage collection of old transition rules) and that the infinite set of transitions over time is finite at any given instant.
In their basic form, transition rules can be simple pairs – e.g. an NPC in a game might transition from state angry->grumpy, or happy->morose. While strings or symbols are possibilities for naming states, simple integers should work well in most cases (and cover any enumerations).
In a reactive system, the set of transition rules is provided by the environment, independently of the current state. For example, an air-conditioner might be modeled with states OFF and ON, but existence of the transition rule from OFF->ON, or vice versa, depends on the observed temperature.
When transition rules lead in a complex circle, what is the current state?
I think a natural answer is: ambiguously, the full circle – i.e. every state within it. I envision this as sort of a photon racing around the circle so fast that I see a continuous beam.
However, at the beginning of the next instant, the transition rules might change and the circle might break into a thousand little pieces. Which ‘piece’ of that cycle the photon is in is effectively a non-deterministic choice of all the states in the cycle. Thus, the ‘correct’ answer to record should be based on how non-deterministic selection is modeled.
RST is not generally deterministic: at some instant, there may be more than one available transition from a given state. Non-determinism is not a nice when reasoning about a complex system. Fortunately, it can be handled easily enough. A few possibilities to tame non-determinism:
- Model the non-determinism explicitly maintaining a ‘set’ of possible current-states across each instant. Unfortunately, this may have unbounded state resource costs, which is a bad thing.
- Introduce a simple preference heuristic and apply it to the set of possible current-states for each instant. For example, if states are labeled by integers, always select the least value.
- Similar to option two, but apply preference to individual state transitions (i.e. greedy choice). This option is likely to lead in a fixed cycle, in which case choose the preferred state from the cycle.
I reject the first because my desiderata include predictable long-term memory costs. The second option is less dependent on path, which should make it easier to control when cycles exist (i.e. developers can reason about ‘any path’). But the third option is easier to implement, the most efficient, and easy to teach – simply follow the best available link and record the best available state in case of a cycle.
Of course, this is a solution to corner cases – ambiguity and cycles. These are problems that developers should avoid by controlling the transition ruleset. Such corner cases should not increase average-case costs. With this criterion, the choice between options two and three is obvious: only option three does not significantly increase costs in the normal case.
An interesting note: choosing the state with the least-valued label, assuming integers, is actually quite flexible: If developers want stability, they use positive labels (so that state tends to flow towards zero). If developers want unbounded progress, they use negative labels (so that state tends to flow towards an infinity). I offer a physics metaphor, of state flowing towards a lower energy state. With this understanding, option 3 models a ‘path of least resistance’, which is a simple path-dependence.
When I initially developed these options (and in my first published draft of this article) I favored option two, but I now strongly favor the third option.
Ranged Transition Rules
When states closely related by concept happen to be closely related by label, developers could easily benefit from ranged transition rules. Consider:
- (* -> 5) – every state can transition to 5.
- (4..9 -> 13) – values between 4 and 9 (inclusive) can transition to 13. This will only apply, of course, if they don’t already have a transition to a value less than 13.
- (108.. -> 42) – values greater than or equal to 108 have a possible transition to 42.
- (..3 -> 4) – every value less than or equal to 3 has a transition to 4.
These ranged transition rules can be processed efficiently, and occasional ‘*’ transitions provide extra confidence about the current state of the model. A (*->5) transition will guarantee a state less than or equal to 5 depending on the other transitions.
Developers could leverage ‘*’ transitions to model simple integer variables whose value is simply based on the most recent assignment. This would be most useful when assignments don’t often conflict, where it would correspond closely to integer assignments in an imperative language.
Reactive State Transition for Reactive Demand Programming
Reactive Demand Programming (RDP) is designed to control external state models. However, not all state models work nicely with the constraints imposed for RDP demand-effects. Reactive State Transition is a good candidate. Assuming integer-labels and the least-integer preference heuristic, the following characteristics make RST effective for RDP:
- Transition-rules are applied as a set, meaning the transition rules are (at every instant) idempotent and commutative. These are basic criteria for demand-effects in RDP.
- Transition-rules from multiple agents easily compose, as a simple union. The current set of transition rules is thus easily constructed via collaboration from multiple agents. RDP forbids link-level exclusive control (due to the ‘spatial commutativity’ property). In RDP, all primitive state models must be collaborative. (Exclusive control can be modeled using intermediate state, but is not recommended in any case!)
- RST state is a stable fixpoint relative to the demands – i.e. the state won’t change unless the demands change, even across multiple ‘instants’ with the same transition rules. This offers a form of temporal idempotence, which is critical for discrete state models in RDP. This issue concerns the difference in time models: RDP uses continuous time and RST uses discrete time, thus conceptually there is an ‘infinite’ number of RST instants using the current transition rules between each RDP update of the transition rules. In RDP, discrete state must, at every instant, be a fixpoint of the current state and the demand.
- State is also stable across distinct instants – i.e. changes in transition rules often cause no change in state. This is valuable for scalable composition, making it easy to cache and choke the updates and memoize values and ultimately reduce the amount of re-computation.
- RST systems are weakly composable. That is, multiple RST systems used together can be modeled as one big RST system (with a combinatorial number of states), so long as they share a common control behavior. There is some loss of fidelity with respect to atomic transitions if one distributes control in addition to the state.
- RST systems are easy to model as external and eternal, especially with integral labels where one can assume a default/reset value of zero. Orthogonal persistence is also easy to achieve. These are important properties to eventually support a pure-RDP application.
- RST state is deterministic due to the preference heuristic and can be anticipated easily, and consistently, as a function of the current state and anticipated transition rules.
- RST has predictable costs: constant storage costs, and update costs that are linear per update with the number of transition rules.
These are very nice properties for a state model to have. RST is not unique in possessing them (e.g. tuple spaces also work, as do some constraint models), but RST may well be the simplest widely useful state model for RDP.
The integration between RDP applications and RST should be very simple:
- Developers discover an ‘eternal’ RST system in their environment, e.g. via a directory idiom (ultimately provided by the runtime).
- Each RST system provides two capabilities: one to influence it, one to observe it.
- To influence the RST system, provide a demand signal containing a set of transition rules, which include simple transitions and possibly some ranged transitions. The set of transitions is based on the environmental conditions (and NOT on the current state), and thus the RST can be understood as reacting to the environment.
- The RST system processes the union of demands to influence state at each update to any of the demands.
- Observing the RST system returns a simple discrete signal containing the integer.
RST will serve the same roles in RDP as an FSM model would in an imperative or event-driven model. There are many potential roles, here, including simulation or game state (e.g. character emotion for an AI, commitment to a plan, tracking room-based or containment locations of objects), GUI state (e.g. toggles, radio buttons), and stabilizing control state (e.g. so an air-conditioner doesn’t vacillate on and off at some temp threshold).
State Transitions vs. Rewriting
I started out pursuing these more general models, and I still believe that reactive rewriting is a very promising direction. The difficulty is that they are not ‘simple’. It is difficult, in particular, to analyze a set of rewrite rules and determine whether they terminate on a fixpoint – which is a requirement for a discrete state model in RDP. It would be difficult to understand performance and storage costs in such a model.
Reactive State Transducers?
A ‘transducer‘ is a state machine that labels the transitions with discrete actions to perform whenever that transition occurs. An RST variation on transducers is straightforward enough, and could be useful in a discrete-time reactive model.
However, transducers are generally incompatible with Reactive Demand Programming. This incompatibility is caused by the differences in time models combined with the potential for cycles: RDP uses continuous time, and RST uses discrete time. Conceptually, RST will experience an infinite number of ‘instants’ between each RDP update. For state, this works out – state is already a fixpoint. For actions, however, it only works out when there are no cycles. Conceptually, RST state is running that cycle an infinite number of times, but there is no way to perform ‘infinite’ actions – at least, not without guaranteeing that actions always have a fixpoint, which would constrain the transducer to below a threshold of reasonable utility.
So I reject RST as a transducer in the context of RDP, at least in the internal sense (within the transition rules). RDP does allow externally observing state transitions and taking appropriate action. I strongly disfavor this sort of event-based idiom, but such transition events are easy conditions to detect, statelessly, by use of delay and anticipate.
[edited heavily Oct 05]