Up through early 2011, my visions of RDP still called for `new` (as in `new Object()` or `newIORef`). At that time, my vision of an RDP language was a two-layer model: the language would support a separate initialization step for toplevel and dynamic behaviors. But multiple layers were inconvenient, complex, and inelegant no matter how I spun them; I had difficulty reasoning about persistence, live programming, open extension, and metacircular staged programming.
In February 2011, I stopped work on RDP, then sat down and tried to understand the problem more deeply. What is needed for persistence and upgrade? How do we ensure state is visible and extensible? What is necessary to avoid those extra layers of complexity and turn RDP into a complete paradigm? I don’t clearly recall my thought process at the time. But, with some combination of stubborn effort and minor Eureka moments, I traced the problem to a concept that had been so culturally ingrained as “good” that I had never questioned it: local state.
The primary cause for problems achieving persistence, upgrade, visibility, extensibility, and live-programming is local state. And I don’t just mean the explicit local state (mutable references and objects). Even implicit local state, represented in continuations, closures, callbacks, message queues, procedural stacks, dataflow loops, etc. will cause the same problems. The issues are inherent to the fundamental nature of local state: state cannot be cheaply recomputed or regenerated like other live values, and because the state is locally encapsulated it is semantically inaccessible to components that might provide persistence, extensions, or support transition of state during upgrade.
Push state just beyond the edges of our program logic, into external databases or filesystems… or to type-rich language-local abstractions that happen to look a lot like filesystems and databases. Modularity, security and exclusivity concerns can be addressed by secure partitioning of a stateful resource space across different subprograms, e.g. similar to chroot jails.
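To make the contrast concrete, here is a minimal sketch in Python (not RDP, and not taken from Sirea). The names `make_local_counter`, `counter_v1`, `counter_v2`, and the plain dict standing in for an external store are all assumptions invented for illustration. The sketch shows only one of the problems named above: what happens to accumulated state when the logic is replaced.

```python
# Hypothetical illustration; every name here is invented for this sketch.

def make_local_counter():
    """Local state: the count lives inside a closure, hidden from everyone else."""
    count = 0

    def step():
        nonlocal count
        count += 1
        return count

    return step


def counter_v1(store, path):
    """Stateless logic: the count lives in `store`, outside the program logic."""
    store[path] = store.get(path, 0) + 1
    return store[path]


def counter_v2(store, path):
    """An 'upgraded' logic that counts by two; it continues from v1's state."""
    store[path] = store.get(path, 0) + 2
    return store[path]


# Local state: swapping in new logic discards whatever the old closure held.
local = make_local_counter()
local()
local()
local = make_local_counter()   # "upgrade" by replacing the closure
lost = local()                 # the count has started over

# External state: the store outlives any particular version of the logic.
store = {}
counter_v1(store, "app/clicks")
counter_v1(store, "app/clicks")
kept = counter_v2(store, "app/clicks")   # the upgrade sees the earlier clicks
```

Because the count sits in `store` rather than in a closure, anything else holding the store (a persistence layer, a debugger, an extension) can also see it, which the closure version gives no way to do.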
External state is a shared, global state.
In modern programming culture, we are taught that global state is bad, that shared state is bad, and that shared, global state is doubly bad. Belief in the evil of global shared state does seem well justified when presented in the context of imperative programming, multi-threaded concurrency, and ambient authority. However, that is an attribution error: the context is awful with or without global state. The belief is also just plain inconsistent with the use of databases (a pain felt by anyone who uses an ORM). With sane programming models for concurrency, consistency, and composable security, shared global state is great; only local state is problematic.
If we eliminate local state, both explicit and implicit, our programs become stateless logics manipulating a stateful substrate. Programs become simpler: no need for concepts of creation or destruction. Those concepts are replaced by discovery – potentially in an infinite graph of stateful resources – with default states, and subgraph resets. We compose and manipulate resources that already exist, that continue to persist beyond the lifetime of our program. Orthogonal persistence, resilience, open extension, visibility, runtime upgrade, and many other advantages come easily once we decide to abandon local state.
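The idea of discovery with default states and subgraph resets can be sketched as follows. This is an illustrative Python model under my own assumptions, not how RDP or Sirea implements it: the `ResourceSpace` class, its `read`/`write`/`reset` methods, and the slash-separated string paths are all hypothetical.

```python
class ResourceSpace:
    """A conceptually infinite space of resources, one per path.

    Nothing is ever created or destroyed. Every path already 'exists' in its
    default state; we only record the paths whose state differs from default.
    """

    def __init__(self, default=0):
        self._default = default
        self._written = {}   # path -> non-default state

    def read(self, path):
        # Discovery: reading a never-touched path just yields the default.
        return self._written.get(path, self._default)

    def write(self, path, value):
        if value == self._default:
            self._written.pop(path, None)   # back to default: nothing to store
        else:
            self._written[path] = value

    def reset(self, prefix):
        """Return the resource at `prefix` and everything beneath it to default."""
        doomed = [p for p in self._written
                  if p == prefix or p.startswith(prefix + "/")]
        for p in doomed:
            del self._written[p]


space = ResourceSpace()
before = space.read("game/player/score")   # never written, still readable

space.write("game/player/score", 10)
space.write("game/player/lives", 3)
space.write("settings/volume", 7)

space.reset("game")                        # subgraph reset
after = (space.read("game/player/score"), space.read("settings/volume"))
```

There is no constructor or destructor for an individual resource here: reading an untouched path and reading a path after a reset are indistinguishable, which is the sense in which reset replaces destruction.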
Of course, it isn’t easy to abandon local state. As I have described for event systems, many of our programming models today have a great deal of implicit local state. To be rid of this is challenging. Even purely functional models like FRP tend to hold onto local state (modeling local integrals and accumulators). To ease transition from local state to proper use of global state, new idioms are required.
I spent most of 2011 March through October designing state models and idioms to support RDP. I didn’t have much luck (beyond tuple spaces) until Reactive State Transition. Use of external state enables RDP to be a complete programming paradigm – in both senses of being Turing complete (via incremental manipulation of state) and sufficient for general purpose programming. Of course, it is still preferable to avoid state if it is not essential.
Since I started blogging only in 2011 May, RDP has never been presented on this blog with the early visions for use of local state. A year after I wrote nothing `new` in RDP, I believe even more strongly that `new` is harmful, that local state is harmful, even if implicit.
I write in my Sirea readme:
A tree-shaped resource space, where each subtree is a recursive resource space, is nearly ideal:
- path names can be stabilized using meaningful domain values
- can be securely partitioned; no path from child to parent
- subtree-level operations feasible: reset, clone, splice
- parent can easily observe and influence child resources
- readily supports orthogonal persistence and versioning
You’re probably thinking, “hey, that’s basically a filesystem!” And you’re right. A filesystem metaphor to manage resources is superior in every way to use of `new`. The tree structure is discoverable, extensible, auditable, persistable, securable, composable, and suitable for declarative access. With a little discipline to ensure stability of schema and locators, the tree structure effectively supports live programming. The ability to audit and discover resources is useful for visual presentation, developer awareness, and debugging.
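A rough sketch of such a tree-shaped resource space, again in Python and again under stated assumptions: the `Subtree` class and its methods are hypothetical names, not Sirea's API, and Python's leading-underscore attributes are only a convention, so the "no path from child to parent" property holds here by discipline rather than by enforcement.

```python
class Subtree:
    """A handle on one subtree of a shared store (hypothetical API)."""

    def __init__(self, store, prefix=()):
        self._store = store     # shared dict: path tuple -> value
        self._prefix = prefix   # where this handle is rooted

    def child(self, *names):
        # The only navigation offered is downward; there is no parent().
        return Subtree(self._store, self._prefix + names)

    def read(self, default=None):
        return self._store.get(self._prefix, default)

    def write(self, value):
        self._store[self._prefix] = value

    def _paths(self):
        n = len(self._prefix)
        return [p for p in self._store if p[:n] == self._prefix]

    def reset(self):
        """Subtree-level reset: everything under this handle returns to default."""
        for p in self._paths():
            del self._store[p]

    def clone_to(self, target):
        """Subtree-level clone: copy this subtree's state over `target`."""
        n = len(self._prefix)
        target.reset()
        for p in self._paths():
            target._store[target._prefix + p[n:]] = self._store[p]


root = Subtree({})

# Path names are meaningful domain values, so they stay stable across runs.
alice = root.child("users", "alice")      # handle given to a subprogram
alice.child("inbox", "unread").write(3)

# The parent can observe the child's resources through its own handle.
seen_by_parent = root.child("users", "alice", "inbox", "unread").read()

alice.clone_to(root.child("backup", "alice"))
alice.reset()
after_reset = alice.child("inbox", "unread").read()
in_backup = root.child("backup", "alice", "inbox", "unread").read()
```

Handles are path-based rather than object references, so a handle stays valid across `reset` and `clone_to`: the subprogram holding `alice` keeps working against the same location whatever state it currently holds.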
I expect there are programming subcultures that already grok the problem, if not the cause – RESTful web architects, users of the Erlang/OTP platform, users of publish/subscribe systems. But I’ve been there, and it still took me years to even recognize my “local state is good, global shared state is bad” prejudice. My mind had been poisoned, probably by Object Oriented Programming.
If you think global shared state is bad, you’re doing it wrong. To achieve large scale, robust, resilient, maintainable, extensible, eternal systems, we must transition away from local state and shove essential state into global, shared spaces where it can be represented independently of program logic.
When we need state, global state is great. Local state is the mind killer.
Clarifications: I’ve had some arguments on the internet recently that boil down to a few misunderstandings. In one case, the other guy was reading ‘local’ and thinking ‘physical local (vs remote)’, which is reasonable but isn’t what I meant. In a distributed system, code and data can be modeled as having locations and partitions, and we can speak of migrating code or data based on access patterns; think of distributed filesystems, NUMA, mmap, and CPU caches. But the ‘local’ I mean is about state embedded within a software component. Perhaps if I had used ‘internal vs external’ state (internal state is poison!) this confusion would have been avoided. In another case, a different guy was assuming non-local (external) state must be durable. But consider tmpfs as an obvious counter-example.