I understand `state` to be a summary of the past, maintained in the present, to inform the future. State is a powerful tool, and essential in many problems (such as writing this text). But state is often too powerful – and that power corrupts. State enables momentary inconsistencies of the past to propagate into the future, hindering resilience and complicating failure modes.
The future couldn’t last. We nailed it to the past. – Savatage
State is something to avoid whenever feasible.
Fortunately, many problems (planning, configuration, UI or graph layout, dependency management, resource scheduling and control, etc.) don’t call for state. They only need stability – i.e. inertia, so they don’t vacillate between decisions.
The difference is that:
- For state, the past is meaningful. If we randomize state, we corrupt the program’s whole future. If we reset state, we lose valuable information. If we change the rules guiding state, we introduce inconsistencies or transition problems for the current state.
- For stability, the past is meaningless. If we randomize or reset conditions in a stable system, or even change the rules, we’ll soon converge on a new stable future. At worst, we do some rework and experience brief turbulence.
To achieve stability without stateful semantics, we must eliminate any formal dependency between the present and the past. Yet, we still need some relationship to the past to have stability – not a causal relationship, but a correlation.
Consider a simple example of control software. If we choose the power-state of an air conditioner (on | off) based on the measured temperature (76 degrees Fahrenheit), then minute fluctuations between 76.001 and 75.999 could cause significant changes in the power-state. For stability, we want to dampen the system – tiny changes in measured temperature should not cause rapid power-cycling of the air conditioner. What we want is hysteresis: say, powering the air-conditioning system up at 76 degrees and back down at 73.
This is, of course, trivial to achieve by use of explicit state. But we want to do without.
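For contrast, the conventional stateful solution really is only a few lines – a hysteresis controller whose next output depends on its own previous output. A minimal sketch (function names are mine, not part of any proposal):

```python
# Conventional stateful hysteresis: the next power-state depends on the
# previous one. The `powered` variable is exactly the explicit state
# this article wants to avoid.
def make_stateful_controller(on_at=76.0, off_at=73.0):
    powered = False  # hidden state: a summary of the past

    def step(temp):
        nonlocal powered
        if temp >= on_at:
            powered = True
        elif temp <= off_at:
            powered = False
        # between the thresholds, the past alone decides
        return powered

    return step
```

Note how a corrupted or reset `powered` flag propagates into the future: the controller can hold the wrong state for the entire dead band between 73 and 76 degrees.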
Any deterministic decision made without reference to the past will be volatile in the above sense: a pure function of the present temperature must map 75.999 to exactly one power-state, so readings straddling the threshold flip the air conditioner back and forth. Thus I can conclude: to achieve stability without state requires abandoning determinism.
Temperature 75.999 instead admits two valid states – the air conditioner may be on or off. The present state of the control software is formally non-deterministic. But, informally, we keep the actual state stable from instant to instant whenever feasible. With this understanding, we could express a stable air-conditioning example with two declared allowances:
For temperatures less than 76, the air conditioner may be off. For temperatures greater than 73, the air conditioner may be on.
At 74 degrees, if the air conditioner is on, it will typically remain on until the temperature reaches 73 degrees. That’s what stability offers us. But if there is a moment of instability, and we decide the air conditioner should be off, we’ll settle into a new stable system where it remains off until we again reach 76 degrees.
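The two allowances and the informal stability preference might be sketched like this (a hypothetical encoding; the semantics are the allowance sets alone, and the tie-breaking heuristic sits outside them):

```python
# Declared allowances: which power-states are semantically valid at a
# given temperature. This is the whole program semantics -- no state.
def allowed_states(temp):
    states = set()
    if temp < 76.0:
        states.add('off')   # below 76, the air conditioner may be off
    if temp > 73.0:
        states.add('on')    # above 73, the air conditioner may be on
    return states

# Heuristic, not semantics: the runtime may prefer whatever it chose in
# the previous instant. Any element of allowed_states(temp) is correct.
def stabilize(temp, previous=None):
    allowed = allowed_states(temp)
    if previous in allowed:
        return previous     # inertia: stick with the prior choice
    return min(allowed)     # arbitrary but deterministic fallback
```

At 74 degrees both states are allowed, so `stabilize` simply echoes the previous choice; only crossing 73 or 76 forces a change.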
In the absence of `state`, the proposed software blindly controls the state of the air conditioner. There is no query of current state to decide the next state. This blind control is good enough. We don’t care which state the control software chooses for the air conditioner, so long as it doesn’t vacillate. A volatile pseudo-state and informal stability is simple and sufficient.
As far as program semantics are concerned, we could power cycle our air-conditioner every nanosecond between 73 and 76 degrees. But, in practice, we have very few motivations to implement it that way. We have much greater motivation to favor stable decisions (even more than stability; efficiency is also affected). And so, in practice, we can program and reason about our systems with a justifiable expectation of stability.
Stateless Stable Models
Constraint programming, answer set programming, satisfiability problems, and similar models seem to achieve the right mix of non-determinism and logically instantaneous semantics. The runtime must reactively tweak a solution in accordance with changing conditions, but indeterminism of the solution allows it to be `stable` across much variation in condition.
I’d like to take advantage of state-of-the-art work on SMT solvers. I believe that easy access to SMT solvers could be a silver bullet for many problems. (I’d need to stabilize the solution, anyway, to fit reactive systems.) For RDP, I also need an idempotent, commutative, extensible model, i.e. where each rule is introduced by one demand.
One feasible approach is to leverage the well proven constraint-logic programming model. But, rather than entailment rules, I split the extensions into constraints and allowances. For example, consider the system:
- foo(X,Y) is allowed by X > 2*Y, bar(X,Y)
- foo(X,Y) is allowed by X < Y/2
- foo(X,Y) is constrained by X ≤ 100
- foo(X,Y) is constrained by X ≥ 0
The resulting entailment rule for foo is simply the sum (∨ = or) of allowances times the product (∧ = and) of constraints – in this case: foo(X,Y) :- (((X > 2*Y) ∧ bar(X,Y)) ∨ (X < Y/2)) ∧ (X ≤ 100) ∧ (X ≥ 0). These constraints and allowances are trivially commutative and idempotent, and it is clear that we could `inject` more such rules, which makes the model very extensible.
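The combination rule can be made concrete with a small sketch (the predicate names come from the example above; the stand-in `bar` and the list-of-lambdas encoding are my own illustration):

```python
# Hypothetical encoding of the four rules above. The entailment for
# foo(X, Y) is the OR of its allowances AND-ed with all its constraints.
def bar(x, y):
    return True  # stand-in for the bar(X, Y) subgoal; assumed satisfiable

allowances = [
    lambda x, y: x > 2 * y and bar(x, y),   # foo allowed by X > 2*Y, bar(X,Y)
    lambda x, y: x < y / 2,                 # foo allowed by X < Y/2
]
constraints = [
    lambda x, y: x <= 100,                  # foo constrained by X <= 100
    lambda x, y: x >= 0,                    # foo constrained by X >= 0
]

def foo(x, y):
    # sum (OR) of allowances times product (AND) of constraints
    return (any(a(x, y) for a in allowances)
            and all(c(x, y) for c in constraints))
```

Appending a duplicate constraint or allowance to either list leaves `foo` unchanged, which is the idempotence that makes injecting rules from multiple agents safe.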
Note: the RHS of each allowance or constraint uses only (∧), implicitly via comma. Allowances – choices and decision points – can only be introduced at symbols. This restriction exists for two reasons. First, it ensures that allowances are uniformly extensible. Second, it isolates choice to named symbols, which makes representing a sequence of choices much easier (e.g. for recording or learning from them). It might be better to go further and require that every constraint simply be a name for a rule (so it is easier to weaken constraints and identify duplicate constraints).
So far I’ve described only a system of constraints and allowances, forming rules with choices.
What happens next is an agent demands a particular symbol, such as `foo` (with no embellishments; for RDP, it must be easy to know when two agents ask for the same thing). The response is a value that meets the constraints – in this case an (X,Y) pair. In an RDP system, the allowances and constraints may change over time. The goal is to provide a stable answer – avoiding and minimizing changes in the answer, and also avoiding changes in choice (for allowances). The approach to stability is heuristic, not part of the program semantics.
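The demand/response cycle with a stability heuristic might look like the following sketch (a naive generate-and-test search over a finite domain; `demand` and the toy rules are hypothetical names, and a real implementation would use a proper solver):

```python
# Re-solve a demanded symbol whenever the rules change, but return the
# previous answer whenever it still satisfies the current rules. The
# preference for the old answer is heuristic, outside the semantics.
def demand(predicate, domain, previous=None):
    if previous is not None and predicate(*previous):
        return previous                  # stable: old answer still valid
    for candidate in domain:             # otherwise, search for any answer
        if predicate(*candidate):
            return candidate
    return None                          # no answer; the client is told so

# Toy rules that tighten over time, plus a small search grid.
rule_v1 = lambda x, y: x >= y
rule_v2 = lambda x, y: x >= y + 5
grid = [(x, y) for x in range(10) for y in range(10)]

first = demand(rule_v1, grid)            # some valid pair
again = demand(rule_v1, grid, first)     # unchanged rules: same pair
moved = demand(rule_v2, grid, again)     # old answer now invalid: new pair
```

Returning `None` corresponds to continuously informing the client that the demand currently has no answer.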
Of course, it is possible that there is no answer to foo, or that we don’t find an answer for a long time. The client agent is also continuously informed of this status, i.e. that there is no answer or it’s undecided.
Stability with Optimization and Preference
Seeking an optimal solution is at odds with stability. For stability, we need to stick with our old choices for a while. Sometimes this means standing by a merely adequate solution. But stability isn’t entirely contradictory to optimization – in particular, if we seek local optima – a local minimum or maximum – we can achieve metastability. It would be convenient to leverage these optima to reach better, preferred solutions when opportunity presents itself.
I only have a rough idea how to do this: for each allowance, add some metadata about (abstract) sunk costs (which are always non-zero and positive) and recurring costs (which might be negative to model gain). The system will seek `better` solutions, where better means a lower total upkeep cost, or even a solution that pays for itself. The upkeep for a `sunk` cost is zero until the choice needs to change, but in some environments (where rules are changing rapidly) repeatedly paying sunk costs might be the largest contributor to upkeep. Recurring costs are paid continuously, i.e. the value might represent a cost per second. Preferences would be represented in terms of low recurring costs, and reluctance to change in terms of high sunk costs.
A hiccup is accounting for the cost contribution of constraints. Common cost combiners (+, *) are not idempotent, so we can’t just add constraint costs (unless we’re willing to remove equivalent constraints via symbolic analysis, which I am not). We do want constraint costs to contribute to each decision so we don’t get silly answers that would be impossible to implement. A viable option is `cost of choice + max of costs of constraints` so that if a constraint is repeated twice its costs are only counted once.
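The `cost of choice + max of costs of constraints` combiner is easy to sketch (a toy accounting function of my own devising, assuming costs have already been attached to the chosen allowances and the applicable constraints):

```python
# Hypothetical upkeep accounting. Choice costs add up per decision made,
# but constraints combine via max: max is idempotent, so a constraint
# that appears twice contributes its cost only once.
def upkeep(choice_costs, constraint_costs):
    choice_total = sum(choice_costs)                     # one cost per choice
    constraint_total = max(constraint_costs, default=0)  # idempotent combine
    return choice_total + constraint_total
```

Duplicating a constraint leaves the total unchanged, which is exactly the property that `+` and `*` lack.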
Addendum (2012 April 05): the weighted logic language Dyna might offer some neat ideas on how to express logic with values. It would be convenient to replace those commas (foo(X,Y),(X>Y),etc.) with the cost combiners (foo(X,Y)+(X>Y)+etc.). The value `false` then becomes an infinite sunk cost. But I’m not sure how this would work with recurring costs. It might be better to stick with simple weights (w1*foo(X,Y), etc.).
An interesting property is that these `stateless stable models` can transparently be turned into distributed intelligence systems, possessing both knowledge and creativity. It doesn’t take much: heuristically remember what has worked in the past, and try those solutions first. The intelligence principle is similar to a neural network, except that a constraint system provides its own feedback about what works. Keeping these `memories` becomes a source of stability without semantic state. The system is free to forget and learn new successful search patterns.
Ultimately, this small (but compositional) intelligence provides developers with predictable and stable system behavior, even across patterned variations in rules. We don’t need global analysis of change patterns; the system simply adapts to them and learns how to find solutions quickly even in a changing environment. Rules change in predictable ways, and learning captures those changes.
I was surprised to find this valuable feature emerging from a simple approach to effective stability. It actually worried me a little. I wrote `build Skynet` on my career goals in high school, but I didn’t actually mean it. Fortunately, secure modularity (object capability model) can ensure such an implicit distributed intelligence is chained to negotiating the interests of humans.
At the moment, I’m short on ideas on how to model the learning itself. Presumably one would need to keep a highly compressed summary of which search paths have worked in the past – enough to guide a search or offer a useful heuristic weight to a success path, but not so much that it becomes a memory burden. Ideally I want fixed memory per node, or perhaps logarithmic with time. More design work is necessary. I’m wondering about approaches involving exponential decay to combine and abstract similar search paths (e.g. based on similar symbols and values).
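One speculative shape for that summary – a bounded table of search-path weights with exponential decay, so old successes fade and memory stays fixed per node. All names here are hypothetical; this is a sketch of the design space, not a worked-out proposal:

```python
from collections import defaultdict

# Bounded memory of which search paths have worked. Decay lets the
# system forget and learn new successful patterns; the capacity bound
# keeps memory fixed rather than growing with history.
class PathMemory:
    def __init__(self, decay=0.9, capacity=64):
        self.decay = decay
        self.capacity = capacity
        self.weights = defaultdict(float)

    def reward(self, path):
        # decay everything, then reinforce the path that just worked
        for k in list(self.weights):
            self.weights[k] *= self.decay
        self.weights[path] += 1.0
        # bound memory: forget the weakest paths first
        if len(self.weights) > self.capacity:
            weakest = sorted(self.weights, key=self.weights.get)
            for k in weakest[: len(self.weights) - self.capacity]:
                del self.weights[k]

    def ordered(self, paths):
        # try remembered successes first when searching
        return sorted(paths, key=lambda p: -self.weights[p])
```

Decay plus a capacity bound is only one option; combining similar paths by symbol structure, as mentioned above, would need something richer.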
Conversely, I could probably turn many machine learning models into stateless stable models. For example, I could adapt a dynamic hierarchical clustering algorithm to provide a stable classification of points.
[Addendum Apr 18]: I have learned that the class of dynamic algorithms for time-varying data is called dynamic anytime algorithms. Many of these seem applicable to creating stateless stable models.