Local State is Poison

Up through early 2011, my visions of RDP still called for `new` (as in `new Object()` or `newIORef`). At that time, my vision of an RDP language was a two-layer model: the language would support a separate initialization step for toplevel and dynamic behaviors. But multiple layers were inconvenient, complex, and inelegant no matter how I spun them; I had difficulty reasoning about persistence, live programming, open extension, and metacircular staged programming.

In February 2011, I stopped work on RDP, then sat down and tried to understand the problem more deeply. What is needed for persistence and upgrade? How do we ensure state is visible and extensible? What is necessary to avoid those extra layers of complexity and turn RDP into a complete paradigm? I don’t clearly recall my thought process at the time. But, with some combination of stubborn effort and minor Eureka moments, I traced the problem to a concept that had been so culturally ingrained as “good” that I had never questioned it: local state.

The primary cause for problems achieving persistence, upgrade, visibility, extensibility, and live-programming is local state. And I don’t just mean the explicit local state (mutable references and objects). Even implicit local state, represented in continuations, closures, callbacks, message queues, procedural stacks, dataflow loops, etc. will cause the same problems. The issues are inherent to the fundamental nature of local state: state cannot be cheaply recomputed or regenerated like other live values, and because the state is locally encapsulated it is semantically inaccessible to components that might provide persistence, extensions, or support transition of state during upgrade.
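A minimal Python sketch of the implicit case (the names `make_counter` and `step` are invented for illustration): a closure keeps a tally that nothing outside it can reach, and rebuilding the closure, as a restart or code upgrade would, silently discards it.

```python
def make_counter():
    count = 0  # implicit local state, captured by the closure below

    def step():
        nonlocal count
        count += 1
        return count

    return step


counter = make_counter()
counter()
counter()

# The tally lives only inside the closure. No other component can persist,
# inspect, or migrate it, so re-creating the closure starts from zero again.
upgraded = make_counter()
first_after_upgrade = upgraded()
```

The old closure still holds its count, but only code that already has a reference to `counter` can ever observe it.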

The solution?

Push state just beyond the edges of our program logic, into external databases or filesystems… or to type-rich language-local abstractions that happen to look a lot like filesystems and databases. Modularity, security and exclusivity concerns can be addressed by secure partitioning of a stateful resource space across different subprograms, e.g. similar to chroot jails.
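As a rough illustration in Python, assuming an ordinary directory stands in for the external resource space: the function `bump` and the file name `visits.json` are hypothetical, and handing each subprogram only its own directory is a convention here, not an enforced jail.

```python
import json
import tempfile
from pathlib import Path


def bump(partition: Path) -> int:
    """Stateless logic: read the tally from the partition, write back tally + 1."""
    cell = partition / "visits.json"
    visits = json.loads(cell.read_text()) if cell.exists() else 0
    cell.write_text(json.dumps(visits + 1))
    return visits + 1


root = Path(tempfile.mkdtemp())  # stands in for the shared, external space
alice, bob = root / "alice", root / "bob"
for partition in (alice, bob):
    partition.mkdir()

# Each call carries nothing over in program memory; all state sits in the partition.
bump(alice)
bump(alice)
bump(bob)
alice_total = bump(alice)
bob_total = bump(bob)
```

Because `bump` holds nothing between calls, the process could be restarted or the function replaced between any two calls without losing the tallies.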

External state is a shared, global state.

In modern programming culture, we are taught that global state is bad, that shared state is bad, and that shared, global state is doubly bad. Belief in the evil of global shared state does seem well justified when presented in the context of imperative programming, multi-threaded concurrency, and ambient authority. However, that is an attribution error: the context is awful with or without global state. The belief is also just plain inconsistent with the use of databases (a pain felt by anyone who uses an ORM). With sane programming models for concurrency, consistency, and composable security, shared global state is great; only local state is problematic.

If we eliminate local state, both explicit and implicit, our programs become stateless logics manipulating a stateful substrate. Programs become simpler: no need for concepts of creation or destruction. Those concepts are replaced by discovery – potentially in an infinite graph of stateful resources – with default states, and subgraph resets. We compose and manipulate resources that already exist, that continue to persist beyond the lifetime of our program. Orthogonal persistence, resilience, open extension, visibility, runtime upgrade, and many other advantages come easily once we decide to abandon local state.
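One way to picture discovery with default states and subgraph resets, sketched in Python under the assumption that resources are addressed by paths. The class `ResourceSpace` and its methods are hypothetical and are not part of RDP or Sirea.

```python
class ResourceSpace:
    """Hypothetical model: every path already 'exists', holding a default until written."""

    def __init__(self, default=0):
        self._default = default
        self._written = {}  # only non-default states take up room

    def get(self, path):
        return self._written.get(tuple(path), self._default)

    def put(self, path, value):
        self._written[tuple(path)] = value

    def reset(self, prefix):
        """Return every resource under `prefix` to its default state."""
        prefix = tuple(prefix)
        for path in [p for p in self._written if p[:len(prefix)] == prefix]:
            del self._written[path]


space = ResourceSpace()
untouched = space.get(["game", "player", "score"])  # discovered, never created

space.put(["game", "player", "score"], 40)
space.put(["game", "enemy", "hp"], 7)
space.put(["log", "lines"], 3)
space.reset(["game"])  # subgraph reset: everything under "game" is default again
```

Nothing here is constructed or destroyed: reading an unwritten path and reading a reset path are indistinguishable.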

Of course, it isn’t easy to abandon local state. As I describe of event systems, many of our programming models today have a great deal of implicit local state. To be rid of this is challenging. Even purely functional models like FRP tend to hold onto local state (modeling local integrals and accumulators). To ease transition from local state to proper use of global state, new idioms are required.

I spent most of 2011 March through October designing state models and idioms to support RDP. I didn’t have much luck (beyond tuple spaces) until Reactive State Transition. Use of external state enables RDP to be a complete programming paradigm – in both senses of being Turing complete (via incremental manipulation of state) and sufficient for general purpose programming. Of course, it is still preferable to avoid state if it is not essential.

Since I started blogging only in 2011 May, RDP has never been presented on this blog with the early visions for use of local state. A year after I wrote nothing `new` in RDP, I believe even more strongly that `new` is harmful, that local state is harmful, even if implicit.

I write in my Sirea readme:

A tree-shaped resource space, where each subtree is a recursive resource space, is nearly ideal:

  • path names can be stabilized using meaningful domain values
  • can be securely partitioned; no path from child to parent
  • subtree-level operations feasible: reset, clone, splice
  • parent can easily observe and influence child resources
  • readily supports orthogonal persistence and versioning

You’re probably thinking, “hey, that’s basically a filesystem!” And you’re right. A filesystem metaphor to manage resources is superior in every way to use of `new`. The tree structure is discoverable, extensible, auditable, persistable, securable, composable, and suitable for declarative access. With a little discipline to ensure stability of schema and locators, the tree structure effectively supports live programming. The ability to audit and discover resources is useful for visual presentation, developer awareness, and debugging.
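The properties listed above can be sketched with a small in-memory tree in Python. This is an illustration only, not Sirea's API: the `Node` class and its methods are invented, and Python's underscore convention does not actually stop a child from being handed a parent reference.

```python
import copy


class Node:
    """Hypothetical tree-shaped resource space; each subtree is itself a space."""

    def __init__(self):
        self.value = None
        self._children = {}  # no reference back to the parent is kept

    def child(self, name):
        # Children are reached by stable, meaningful names; missing ones start empty.
        return self._children.setdefault(name, Node())

    def reset(self):
        self.value = None
        self._children.clear()

    def clone_into(self, target):
        target.value = copy.deepcopy(self.value)
        target._children = copy.deepcopy(self._children)

    def paths(self, prefix=()):
        """Audit: yield every path holding a value, as seen from this node."""
        if self.value is not None:
            yield prefix
        for name, node in sorted(self._children.items()):
            yield from node.paths(prefix + (name,))


root = Node()
editor = root.child("editor")  # the subtree handed to an 'editor' subprogram
editor.child("buffer").value = "draft"
editor.child("cursor").value = 5

editor.clone_into(root.child("editor-backup"))  # subtree-level clone
editor.reset()                                  # subtree-level reset

seen_by_parent = list(root.paths())  # the parent can observe its children
```

The holder of `editor` can reach `buffer` and `cursor` but has no way to name `editor-backup`, while the holder of `root` can audit both subtrees.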

I expect there are programming subcultures that already grok the problem, if not the cause – RESTful web architects, users of the Erlang/OTP platform, users of publish/subscribe systems. But I’ve been there, and it still took me years to even recognize my “local state is good, global shared state is bad” prejudice. My mind had been poisoned, probably by Object Oriented Programming.

If you think global shared state is bad, you’re doing it wrong. To achieve large scale, robust, resilient, maintainable, extensible, eternal systems, we must transition away from local state and shove essential state into global, shared spaces where it can be represented independently of program logic.

When we need state, global state is great. Local state is the mind killer.

This entry was posted in Concurrency, Language Design, Modularity, Open Systems Programming, Security, State.

46 Responses to Local State is Poison

  1. I arrived at the same conclusion while coming from a different perspective. Btw, a shared global state could also be seen as a document (with subsections, etc). A limited perception of this global state (like a document’s subsection) could be useful though.

  2. Kevin Edwards says:

    Yay for semantic accessibility! :)

  3. xtofl says:

    Might “My mind had been poisoned, probably by Object Oriented Programming.” be of the same impact as “Goto considered harmful”? I like this article.

  4. I feel like you are confusing “global state” and “state that is globally accessible”.

    The argument that “global state is bad” only applies to situations where resources are contending for and modifying that global state. Global state is also clearly a necessary evil for basically anything that’s going to disk, otherwise you just write garbage.

    Giving state to “somebody else” and calling it “not local” seems disingenuous. Programming is basically just a series of data transformations. We’re basically moving bytes to and from hard disks with screens in-between, but it’s all just stages of transformation.

    At each stage in the transformation you need some form of data to transform. When you say Local State is Poison, my gut reaction is “then what are you transforming?”. If you have no local state then you have no data. If you have no data, then what are you transforming?

    • dmbarbour says:

      I’ll try to address your complaints in order: First, global state is clearly necessary, but not clearly evil. Second, “locality” is always a matter of perspective – if state is pushed to another subprogram, it certainly isn’t local. If you keep pushing, the state eventually moves outside of every subprogram in the application, at which point it is global state. Third, “data” does not imply “state”; state describes keeping a summary over time – e.g. accumulators, integrals, even a clock (a stateful record of the passage of time). You can transform data without keeping stateful summaries, e.g. by restricting pure functions to the spatial dimension.

      I hope this has clarified some things for you.

      • Point taken about your difference between “state” and “data”.

        You can transform data without keeping stateful summaries…

        Sure, but now you have to figure out how to track what your code is actually doing :)

        If I am running some type of service in production, it’s not enough for that service to simply “exist”, it needs some way of telling me that it’s doing work. Those are the counters and accumulators etc.

        Sure I can “offload” this state to someone else. This basically amounts to taking the state and outputting it as data for “someone else” to consume. So now I don’t have any local state and simply have “someone else” holding that data.

        Awesome, now I have a running service that inputs data and outputs transformed data + counters data. I have achieved a stateless service. The whole thing just listens for incoming data requests and pipes out answers + logging/counters.

        Awesome.

        But how do I know which DB I’m connecting to? Isn’t that just state?

        And who is that “someone else”? Is this another program that I write whose sole job is to manage and report state (i.e.: a database)? But if I write a DB, I’m suddenly writing local state again….

        You can’t just “put local state in someone else’s system” because eventually “someone else” is “you”. Good programming isn’t about removing state, it’s about good well-structured management of that state.

        But I’ve been there, and it still took me years to even recognize my “local state is good, global shared state is bad” prejudice…

        Maybe we’re just speaking at cross purposes here. You are correct that we want to limit the amount of “state” that we are storing in local volatile memory. But that’s already a well-known problem.

        Modern development paradigms don’t new up anything. They use dependency injection frameworks. They use declarative resource access. They implement logging as a cross-cutting concern using aspect-oriented-programming. They run MVC web frameworks with layers that fire up, transform and shut down with no lasting effect on other sub-systems.

        But then if this stuff is the solution, do we really have a problem at all?

      • dmbarbour says:

        Awesome? Indeed! :)

        You seem to be hanging on an apparent “infinite regression”: If we push state ever towards the edge of the system, then where is the logic for that state? Well, there are many valid answers: it could be part of the language standard, or it could be an external service like the filesystem or OpenGL. These are fair answers. Compare: Where is the program logic defining a mutable variable in a procedural language? Where is the program logic specifying how we render a framebuffer to screen, or interact with a filesystem? There is, of course, a bootstrap layer involved with switching to systems that have only external state. Perhaps a review of similar issues faced by meta-circular evaluators might help you understand why conceptual infinite regression is not a problem in practice (and is actually quite elegant, in an “everything is a” sense).

        If we are provided only a simplistic state model, we can use program logic to build more sophisticated state models above it. We can implement a DBMS above a filesystem, or similar.

        Re: if this stuff is the solution, do we really have a problem at all?

        As you say, modern development paradigms require sophisticated frameworks, boiler-plate, best practices, and hackish uses of reflection. Developers spend more time working around their paradigms and languages than working with them. Even the small remaining amount of local state used by disciplined developers causes very real maintenance challenges, requiring restarts when tuning configurations or after minor errors. The resulting systems tend to be over schedule, over budget, overly complex, and buggy.

        I believe this is a problem.

      • Thanks for your reply below, now I understand where you are going with your “RDP”. Unfortunately, I was not able to pick this up while reading through several of your blog posts. I just never found the connection between “hey this IOC stuff is useful but convoluted” and “why isn’t IOC baked into my language?”

        BTW, I do agree that we likely have too much overhead in the typical programming stack. I have definitely seen some crazy over-engineered Java code where applying these best practices involves writing 5 classes to implement a function that performs addition.

        I guess it would be great to implement this at the language level (hey duck-typing), but I’m not sure what it really means to have a language that doesn’t provide a new() operator and requires a configuration file just to get started.

        At first glance, it feels like the immediate trade-off for any language that “bakes in” all of the technologies I mention is that suddenly that language has a lot of “hidden” things going on. Adding IOC by default suddenly increases the learning curve for that language unless you can hide it appropriately. But if you “hide” it, then you end up having “magic switches” to expose that functionality as well.

        hmmmm, now that I’m finally on the same page, I’m not sure I have an easy answer :(

      • dmbarbour says:

        In context of reactive programming and orthogonal persistence, a user-input file (aka configuration file) is a better alternative to interactive console input. Such a file both enables a user to adjust his inputs and hangs around in case of a program restart or change in program code. (Of course, we could use a more structured data store than a file.) I would not build the notion of configuration files into the core of a language, but it is a useful idiom that would be a decent part of a standard library for a declarative language – similar to how console IO is a standard part of most imperative languages. It may help to think of configuration files as the declarative version of CLI – optional, but convenient.

        Regarding the Inversion of Control (IoC) arguments: much of IoC’s complexity is a consequence of its stateful embedding in procedural control-flow languages. In a different context, such as spreadsheets, the benefits of IoC are achieved by reactive dataflow concepts. Consider: the dataflow in a spreadsheet is neither hidden from the user nor difficult for users to understand. Rather than “baking in” design patterns such as IoC with all their hidden complexity and implicit local state, it is worthwhile to distill their essential aspects and infuse them carefully into a language. This is not an “easy answer” for the language designer (simple isn’t easy!), but it does make things simpler and easier for the eventual language users.

  5. Brian Balke says:

    Your resource tree sounds like ORBs or DCOM to me.

    • dmbarbour says:

      I believe we frame concepts in terms of what is already familiar to us. I am not familiar with ORBs or DCOM, but I do understand they use unique identifiers for connectivity.

  6. Yannbane says:

    Entity systems seem like an implementation of what you’re describing here.

  7. Max Vlasov says:

    Although it can work, there are some aspects where I find it difficult to change approaches. The current situation might not be ideal, but local state and context allow quick reuse and adaptation. To reuse a function or class you have to do very few things, and the same goes for introducing slightly changed functionality (since, for example, a class already has a local context, any new method can use it immediately). If you’re talking about a piece of code that finds everything it needs without local context by using some embedded rules (for example several XPath-like paths for global xml access) then it’s much harder to adapt to any other program/context. In other words, in modern conventional programs the “how” is usually written once or modified rarely, while the “with what” changes much more frequently during development and execution, implicitly or explicitly. Moving to global state means that we’re making the “with what” semi-frozen and less flexible. Can you ease my anxiety? :)

    • dmbarbour says:

      Local state actually makes adaptation and reuse much more difficult than would external state. Given a subprogram that uses internal state, you cannot easily observe or influence it except by invasive modification of that subprogram. External state is far more accessible, and better supports open extension and collaboration (cf. blackboard metaphor, tuple spaces, data buses, file systems). The same is true for client state: if client state is pushed external to the service (e.g. using unhosted.org patterns) clients gain considerable ability to augment their state with support from third-party services. No invasive modification or monolithic rewrites are necessary. To reuse a subprogram or app multiple times only requires partitioning the external state resources (e.g. different directories in filesystem), then running an independent copy of the subprogram in each partition (with slightly different parameters or inputs).

      Even eschewing local state, one can still use parameters, traits, and other subprogram modifiers for abstraction and reuse. One can encapsulate, transform, attenuate, and compose references to an external environment, all without introducing local state. The notion of local context does not need to be stateful. The “with what” is actually MORE flexible in the absence of local state: there is no risk of losing important work held by local state when we reconfigure at runtime, not even in third-party libraries or frameworks, so developers have much more universal freedom to reconfigure and change “with what” even at runtime.

      You might be interested in a related article: life with objects.

      • I absolutely agree with pushing state to the outer skirts of a system. I also believe that programs and their evaluations should be observably stateless, but that’s another topic.

  8. pere88 says:

    Take a look at Bigraphs: locality of resources is achieved by sorting disciplines that restrict the use of global resources, mimicking localities, much as a type system operates over a free language to enforce some constraints.

    • dmbarbour says:

      Ubicomp bigraphs are very relevant. The sort of “locality of resources” achieved by stable partitioning of a global “place” tree is exactly the sort I promote in place of traditional local state, quite effective for modularity and performance. Portions of the ‘link graph’ could be modeled with dynamic behaviors or objects that encapsulate and hide parts of the place tree (to achieve nice security properties). Thanks for the ref.

  9. Grant Husbands says:

    Does this interact badly with some of the encapsulation necessary for capabilities? In particular, it seems to turn the construction of new objects into the construction of new subtrees and the creation of a new copy of the subprogram using that subtree, but that would mean that any code that instantiates ‘objects’ now immediately earns the ability to interfere in the internal state of that object. Would that not break the internal instantiation in the use of a sealer, for example?

    If the answer is that the sealer is created elsewhere in its own subtree and creates further subtrees on demand, that would seem to negate the debugging advantages and make the sealer entirely opaque to the code using it, making it again object-oriented.

    My mind is probably poisoned by OO, but I’m eager to fully comprehend this model.

    • dmbarbour says:

      Object capability security was one of my initial concerns. Fortunately, it seems tree-structured resource spaces work very well with ocaps. A capability will grant or encapsulate access to a resource (e.g. a particular file or subdirectory) without granting access to that resource’s location (e.g. the parent directory). We partition a parent space into a child space for each subprogram, then we use capabilities to secure interaction between mutually distrustful subprograms. Fractally, we apply the same principle downwards to sub-subprograms, and upwards to superprograms.

      Re: any code that instantiates ‘objects’ now immediately earns the ability to interfere in the internal state of that object

      As a developer of a subprogram, I can instantiate ‘objects’ in MY space (e.g. partitioning a local resource tree). But, through capabilities, I might also instantiate or interact with logical objects in YOUR space. I have the full ability to observe, audit, persist, extend, reset, etc. the states of ‘objects’ instantiated in my space. I do not have the same authority over objects instantiated on my behalf in your space.

      For the sealer/unsealer example, consider: if I create a traditional `new` sealer/unsealer pair, I have much initial freedom to wrap the seal and unseal capabilities before distributing them – e.g. to record every object that gets sealed, even manipulate objects. However, with traditional OOP, I would require foresight to include those wrappers. Using an external space doesn’t offer any additional authority, but does a much better job of preserving authority over time (thus reducing need for foresight!). Whether a sealer/unsealer pair is modeled in my space, or yours, or that of a trusted third party, depends on whose interests are being protected.

      • Grant Husbands says:

        Thanks for the elucidation. I believe I now understand the model and have no useful questions to add. It does seem potentially powerful.

        Just a minor niggle, but I think “global” implies too much; the state as described appears to be contextual, and little code will actually have access to the root, by my estimation.

  10. Pingback: Reactive Knowledge Networking « Pipe Dreams

  11. Pingback: Episode 9 with Kevin Lynagh and Paul deGrandis: web dev ennui, CRDTs, and core.logic « Mostly λazy…a Clojure podcast

  12. Pingback: Ad-Hoc External State Models | Awelon Blue

  13. David Clark says:

    Although I sympathize with your problems with state and your RDP project, your conclusion is quite wrong. I do believe that “global state” is bad unless it is moderated by a wrapper that can enforce security and exclusive use. i.e. Global state in C is generally bad but a flexible interface to a global service is good. Un-moderated access from potentially multiple sources or even un-trusted sources is not conducive to tracking a problem with state. If you outsource all your state to an RDBMS, you still have the problem of not knowing where the source of your problem lies (even with a security system) if more than one program module can change the state in the database.

    In any organization, responsibility for anything and everything must be hierarchical and singular. If nobody has responsibility or more than 1 person holds responsibility for anything then confusion or lack of oversight is the consequence. All data should have exactly 1 owner. That owner can execute requests on its data from others or even temporarily delegate their authority to some other entity but all data should always have at most and exactly 1 entity that is responsible for it.

    Your solution of pushing your state problem into a database just means that you still have to hunt to find the source of the state errors and you are constrained in your data structures to that of the database that is outside of your control. I could also mention that some processing is at least an order of magnitude faster if it resides in direct contact with the data.

    I end this with an analogy from my programming in APL (1975). I made a program that had 47 functions on a single line. Remember that a function name in APL is a single character in width. To create this line I had to place a “display” function in that line to make sure that the data structure that was being manipulated was being created correctly. As the correct code emerged, I moved the “display” window across the line until I had made the line of code. The state wasn’t stored but was “passed” along between the functions and this resulted in a 47 function line that worked (but was very inefficient) and was totally unmaintainable. APL was so notorious for opaque code that many practitioners would just rewrite a function from scratch rather than try to modify the code. Functional programming reminds me a lot of APL except that data was persistent by default and you could look at the current state of that data without touching the functions that manipulated them.

    I think the biggest problem with your conclusion is that programming is about data and programming, not just programming. In my current project (written in C), I have many functions that are “pure” in the functional sense. I consider these to be useful if small “worker” functions. I also have another group of functions that manipulate a certain kind of data structure. I pass these kinds of functions a pointer to that data structure but I would use an object if C had such a facility. I have other functions that are quite specific and sometimes a few hundred lines in length. The point is that I require more than one kind of function. “Pure” functions have their use but so do functions associated with some specific state (object like) or other.

    If state has only local access and there is a guarantee that one and only one entity can have authority over that state (at least at any moment in time) then problems with state can be easily debugged and reasoned with. This means encapsulation to me and that and message passing are what I care most about in OOP code. I have heard some people say that OOP is good at the large and stateless functional programming is better at the small. I use “pure” functions wherever I like in all my projects (no functional language needed) and I don’t believe that functional programming has a lock on this concept any more than it has a lock on “first class functions”. OOP style programming can easily have “pure” functions. From this perspective, I think functional programming is a subset of other paradigms except that functional programming forces all functions to be of one type only even if that isn’t appropriate.

    If you find my comments about functional programming off topic, I apologize, but your article seems to be mostly about your reasons for abandoning OOP programming and embracing stateless functional programming.

    • dmbarbour says:

      Exclusivity and security are orthogonal. I do favor securable state models – e.g. with partitioning, sparse capabilities, or cryptographic mechanisms. And even with external state, developers can reason about exclusivity (with global reasoning, design, discipline).

      hunt to find the source of the state errors

      I believe external state has the advantage here, due to accessibility for debugging and greater potential for on-the-fly programming.

      some processing is at least an order of magnitude faster if it resides in direct contact with the data

      I agree. There are many ways to address this concern – e.g. code distribution, or making a compiler that is data-model aware.

      you are constrained in your data structures to that of the database that is outside of your control

      This is a legitimate concern for external state, and I tackle it in another article. That said, it is not a critical concern if ignored: developers can do a lot with just a few standard data structures (e.g. matrices, lists). Developing new data structures is more often a didactic effort than a practical one.

      abandoning OOP programming and embracing stateless functional programming

      I am not recommending we stick to pure functions. I do believe OOP would be better, too, if it modeled state to be external to every object (as mentioned in life with objects). An object can encapsulate authority to state without encapsulating the state itself.

      • David Clark says:

        Fortunately, encapsulation is not the only way to protect state invariants.

        It might not be the “only” way but it does make the functions and data those functions work on reside “in close proximity” so that they can be worked on or checked without searching. Also, if the language guarantees that changes to those variables can only be done with a small limited set of functions then you should never have to search globally.

        In my long programming career, nothing is worse than having bugs in code that are physically in different source code modules. This is the main reason why “global state” in C is so bad.

        You lose access to the concepts of creation and destruction.

        Good! Why should a programmer be bothered with memory allocation (either way)? If memory management is “taken care of”, then why isn’t that a step forward?

        RDP is highly suitable for time-aligned data.

        How would this system deal with inventory data? The transactions that change the quantity on hand might be time-aligned data but what about the “state” of a particular item? How does a flow system model multiple users accessing that inventory or printing an invoice? I understand the usefulness of DSLs but if a language is to be general purpose, even if at a very high level and not suitable for system programming, don’t you think it should be able to handle a simple inventory case?

        Imperative, mutable state variables with get and put actions are a very poor fit for RDP.

        If get and put functions are allowed, then those variables are global by definition. Not good, I agree. I believe that all changes to encapsulated data should be done by functions within that encapsulation but I mostly prefer collections of encapsulated objects rather than single objects and that means more changes by local functions rather than external ones.

        As an overall view, I think you hint at problems that are not prevalent. An example is your view about abstraction. “But this answer is unsatisfying because it kicks the can, eschews abstraction, and overly empowers the architect.” “Unsatisfying” isn’t a good argument! What’s wrong with “empowering the architect”? Why recreate the wheel when you get a very good one for free?

        I agree that “encapsulating behavior” is a good idea but why take away the data that these functions work on? You say “But behavior encapsulation introduces its own challenge. There is overhead to make the behavior available to its users. There are concerns regarding performance, partial failure, and consistency. Extensions might bypass the view. Ability locally reason about correctness is diminished.” All programming has its challenges! Of course there is some overhead in all code and if not to users then to whom? Performance should be higher if good tools are provided to programmers rather than have them create them for themselves. Failure of any code has to be dealt with all the time. Go (from Google) uses multiple return values from functions to signal failure and I have come up with a “global system variable” (not the same as a global variable) that signals that a function failed whether system sourced or user created. Why is correctness diminished?

        I advise against events for future programming languages.
        Many people, including me, believe that future programming that doesn’t interface directly to the web will be rendered obsolete. Although my project uses message passing to create a form of “event programming”, I wouldn’t call it that exactly. How would you handle a web application that needs to react to many small requests at the same time? I believe that web services can be created in any PL and that gives all the new languages and little guys a kick at the can that they never had with normal desktop software. I see a future where integration will be at the “websocket” layer or similar communicating protocol and the underlying language will be of no concern (to anyone except the developer of course). The basic idea of a PL being a fancy calculator (input, manipulation, output i.e. no state) might make a good DSL but not a general purpose PL.

      • dmbarbour says:

        I agree that encapsulation can be useful for debugging of stateful applications. However, other tools (e.g. edit history, connectivity, and live feedback) are equally useful. And these other tools are more readily (generically, consistently, and conveniently) implemented for ‘external’ state resources than for ad-hoc encapsulated state interfaces. Thus, even if we lose encapsulation, we can address the issue of debugging stateful code, and the resulting environment may even be superior.

        Creation/destruction isn’t memory allocation. Creation includes new GUIDs, genSym, random numbers, mutable stack variables, even new actors or messages – anything that has a beginning, birthdate, or new identity. Losing creation and destruction isn’t bad, but requires an alternative set of idioms and design patterns. The differences will turn many people away without serious consideration, which is unfortunate. One can indirectly model creation/destruction in a system that lacks them implicitly, but it’s a hassle.

        I can’t think of any better way to keep an inventory except in external state, perhaps accompanied by a stateful model of long-running transactions (outstanding requests, commitments, etc.). An RDP-based inventory system would mean “RDP does the system integration and interface” while there are external state resources (formally outside of RDP) to keep the inventory. RDP is not a language; a general purpose RDP-based language (like Awelon) must address the issue of providing external resources in the first place, whether it be state, standalone apps, or an API for web application servers. I describe in a recent article how I plan to approach this in Awelon.
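
        As a rough illustration of the split described above, here is a minimal Python sketch (not RDP or Awelon code). The `ExternalInventory` class is a hypothetical stand-in for an external state resource such as a database table; `Adjustment`, `on_hand`, and `invoice_lines` are likewise invented names. The logic that answers “quantity on hand” or builds an invoice holds no state of its own and only reads a snapshot of the external resource.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adjustment:
    """One inventory transaction: positive delta = received, negative = shipped."""
    sku: str
    delta: int


class ExternalInventory:
    """Hypothetical stand-in for an external state resource (e.g. a database table)."""

    def __init__(self):
        self._log = []

    def append(self, adjustment):
        self._log.append(adjustment)

    def snapshot(self):
        # Hand out an immutable view; the program logic never owns the state.
        return tuple(self._log)


def on_hand(log, sku):
    """Stateless: the current quantity is derived from the external log."""
    return sum(a.delta for a in log if a.sku == sku)


def invoice_lines(log, order):
    """Stateless: order maps sku -> requested quantity."""
    return [
        (sku, qty, "ok" if on_hand(log, sku) >= qty else "backorder")
        for sku, qty in sorted(order.items())
    ]


store = ExternalInventory()
store.append(Adjustment("widget", 10))
store.append(Adjustment("widget", -3))
store.append(Adjustment("gadget", 2))

print(on_hand(store.snapshot(), "widget"))
print(invoice_lines(store.snapshot(), {"widget": 5, "gadget": 4}))
```

        Any number of users or tools can read the same snapshot, and persistence or auditing can be attached to the store without touching the logic.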

        Empowering the architect is fine. Overly empowering, not so much. Even if you don’t know the “best” place and time for a particular decision, you can find plenty of decisions for which “at architecture design” is clearly not the best place and time. There are many easy answers, but they’re all wrong.

        How would you handle a web application that needs to react to many small requests at the same time?

        I would tend to model “the same time” in a logical sense (i.e. using a `T=` value in the URI or data). Or are you asking for performance tips? I’m already developing a model for RDP-based web-apps and servers, where a ‘page’ is a behavior that operates on the client side (i.e. code distribution). And RDP makes it easy to batch signal updates whether over HTTP or WebSockets.

      • David Clark says:

        However, other tools (e.g. edit history, connectivity, and live feedback) are equally useful. And these other tools are more readily (generically, consistently, and conveniently) implemented for ‘external’ state resources than for ad-hoc encapsulated state interfaces. Thus, even if we lose encapsulation, we can address the issue of debugging stateful code, and the resulting environment may even be superior.

        Why either/or? Why not both? What if the encapsulated state could be made fully available to only a select group of outside programs but only while the owner is suspended from using the state itself? A special interface could be created that only those programs would have access to, that could expose all the encapsulated state securely while providing all the benefits of encapsulation most of the time.

        The differences will turn many people away without serious consideration, which is unfortunate. One can indirectly model creation/destruction in a system that lacks them implicitly, but it’s a hassle.

        You used the word “new” which is the memory allocator in C++, hence my thinking you meant memory management. I see mutable state as either persistent or temporary. The temporary is created wherever you need it and its destruction is automatic when your message has been answered. I say it this way as temporary variables don’t need to be cleaned up after a function call so, answering 1 message could be a single function call or thousands or millions of function calls. I see nothing “unusual” about not having to be concerned about memory management at all. In fact, no hassle at all.

        RDP is not a language.

        Then it is some kind of DSL and the techniques of general PL design don’t apply. I thought your comments about “no local state” and “no event model” were directed at general PL. My mistake. By the way, most programming is done on “inventory” type systems. Most programmers spend most of their time programming the user interface, then handling the data (input, reports etc) and the smallest amount of time processing in a stateless processing environment. Any general language that ignores the first 2 of these 3 activities can only be marginally useful to that general programmer audience.

        Overly empowering, not so much.

        I fail to see how you can “empower” a programmer too much. It is their project after all. They may write it well or not but in the end, it still belongs to them. My philosophy is to provide as many power tools to the programmers as possible so that they spend more brain time on their project and less time (or no time) thinking about CS techniques and gotchas. If my language and system can become absolutely invisible to the programmer then I succeed. I care about the speed and efficiency of my language but I care more about the speed of creating and maintaining the code and data. Most code will run perfectly well even if it has not been tweaked for efficiency by the programmer or the language. Most code doesn’t get much use and that it works is more important than how quick it runs.

        Or are you asking for performance tips?

        No, I have lots of experience at making things go fast. I think sometimes I care too much about the speed of my systems. 2 times instantaneous is still instantaneous!

        Your comment about programming in a browser as well as on the server is interesting but it reminds me of the discussion about batch processing and time sharing from the 1970s. Some people said batch processing was more efficient and used fewer resources but the time sharing people said that everybody should at least get something right away. As it turns out, batch processing could be run as just another time sharing user and so time sharing won the day. It could do both. In my message passing system, I can handle any number of requests at the same time and also run longer “batch” jobs as needed. Any general purpose PL that doesn’t do the “time sharing” thing, even if it is hard to create or less efficient, will not succeed in the future. The caveat is that this doesn’t apply to a DSL. As a developer of a “Content Management System” that runs in a browser as well as on the server, I genuinely wish you luck working with Javascript and all the slight differences in implementation on the various browsers. In the end, my solution was to do as little as possible in the browser and most of the work on the server, even though I use PHP which is a very ugly language.

      • dmbarbour says:

        The idea of suspending state for mutually exclusive access is problematic for its own reasons. But controlling distribution of authority to state is fine.

        Then [RDP] is some kind of DSL and the techniques of general PL design don’t apply. I thought your comments about “no local state” and “no event model” were directed at general PL. My mistake.

        Saying “Reactive Demand Programming (RDP) is a language” is wrong in the same way that saying “Functional Programming (FP) is a language” is wrong. It’s a category error. That doesn’t mean they’re DSLs. They’re programming models, with associated paradigms. They may be supported or enforced by a language, which may or may not be general purpose. My comments about “no local state” and “no event model” are in fact directed at general purpose programming. (They aren’t about RDP, but are realized in RDP.)

        Most programmers spend most of their time programming the user interface, then handling the data (input, reports etc) and the smallest amount of time processing in a stateless processing environment. Any general language that ignores the first 2 of these 3 activities can only be marginally useful to that general programmer audience.

        RDP doesn’t ignore these first two activities. Indeed, RDP was heavily influenced by goals to support UIs in difficult environments (CSCW, ZUIs). And RDP does handle data, just doesn’t store it locally. But I think you’d be surprised how much code is stateless after you remove most non-essential state. The bulk of an application can be developed in RDP code.

        I fail to see how you can “empower” a programmer too much. It is their project after all.

        You seem to be assuming a single-programmer project that doesn’t use any libraries or frameworks developed by other programmers. If you ask yourself instead: “how much should the programmer of this framework/library be empowered” you might find a different answer, especially regarding the power of said programmer to make important decisions on your behalf.

        My philosophy is to provide as many power tools to the programmers as possible so that they spend more brain time on their project and less time (or no time) thinking about CS techniques and gotchas.

        I’m interested in avoiding gotchas and intellectual traps. Indeed, that’s what motivates my ‘no local state’ and ‘no event model’ arguments. But I think “to provide as many power tools to the programmer as possible” is potentially contradictory to those goals. I prefer to find a toolset that works well together and that remains under control.

        In my message passing system, I can handle any number of requests at the same time

        It seems by ‘at the same time’ you mean parallelism rather than any formal notion of ‘same time’. I agree that parallelism is valuable, and I don’t neglect it. RDP supports high levels of parallelism and concurrency.

        I genuinely wish you luck working with Javascript and all the slight differences in implementation on the various browsers. In the end, my solution was to do as little as possible in the browser and most of the work on the server

        I don’t plan to touch JavaScript directly, rather to compile a behavior down to JavaScript (probably the high-performance subset asmjs) in order to manipulate an abstract DOM and client-side state.

      • David Clark says:

        I haven’t detected any negativity about the quantity of my posting yet. My compliments to you.

        But I think you’d be surprised how much code is stateless after you remove most non-essential state. The bulk of an application can be developed in RDP code.

        I have made almost nothing but state filled applications for over 35 years. In fact, I think state should be the center of all programming but my intent isn’t to debate functional programming right now. Later maybe!

        “how much should the programmer of this framework/library be empowered”

        I agree that my experience is mostly as a single programmer but I also think most programming, even in a team setting, is broken up so that programming is normally a single person endeavour. Do you really believe that getting access to local variables where the source code to the functions working on that data isn’t available, is a good thing? Personally, I would never use a framework exactly because I wouldn’t have access to the source code. Just getting access to the data wouldn’t help. If I put my name on an application, then I must have access to all the source code in case something goes wrong. Clients don’t care whose fault it is, the problem is always yours to fix. Quite often it is difficult to find where the error is occurring with the source code but without it, you are hooped.

        But I think “to provide as many power tools to the programmer as possible” is potentially contradictory to those goals. I prefer to find a toolset that works well together and that remains under control.

        Yes that is what I said BUT the number of power tools isn’t as important as having a full set of power tools that will do the job, out of the box, in most circumstances. Working well together is preferable, of course. “Remains under control”, sorry but I am not the kind of person that likes to be controlled.

        It seems by ‘at the same time’ you mean parallelism rather than any formal notion of ‘same time’.

        Actually I mean “concurrently” using multiple CPUs which is “at the same time” but not exactly parallelism as in same program on many CPUs. I have worked out ways of providing “parallelism” but I prefer it to be automatic when the system detects that that technique will help. Automatic general “Parallelism” won’t be in my first release but I believe it is important.

        I don’t plan to touch JavaScript directly, rather to compile a behavior down to JavaScript (probably the high-performance subset asmjs) in order to manipulate an abstract DOM and client-side state.

        A very wise decision but definitely not trivial.

      • dmbarbour says:

        Do you really believe that getting access to local variables where the source code to the functions working on that data isn’t available, is a good thing?

        It depends.

        If the variables represent essential state – i.e. state that exists because it is a natural part of the problem being solved – then I will generally desire external access to support extensions. If the state is accidental – e.g. a consequence of representing the solution in an eventful, imperative programming model – then I desire to eliminate the state because state is difficult to debug. The ideal end of those two forces is that all essential state is external state, and all state is essential, and therefore all state is external. While it is unrealistic that I’ll ever eliminate all accidental state, I’ve found there are benefits even in shifting the remaining ‘accidental’ state to external resources – not so much for “getting access” as for the peripheral advantages (orthogonal persistence, runtime code update, a more uniform consistency model and debugging experience, disruption tolerance).

        I would never use a framework exactly because I wouldn’t have access to the source code.

        This is a valid option, though it has its own costs. For example, I doubt I’ll ever be able to compete with the expertise embedded in OpenCV or the Unreal4 engine. And implementing something like SDK or another XML library would be a big hassle. And even having ‘access’ to the source code doesn’t mean much unless you can ‘control’ the source code (i.e. commit to it), which is often a problem for shared libraries that have lots of different users.

        the number of power tools isn’t as important as having a full set of power tools that will do the job, out the box, in most circumstances

        I agree. A PL or IDE should come with a variety of tools to get jobs done.

        “Remains under control”, sorry but I am not the kind of person that likes to be controlled.

        You misread me. I spoke of the tools remaining under control, not the programmer. “Powerful” tools often are difficult to control, and they can damage a product or user if not used with care, sometimes in non-obvious ways. Power is not the property I prize most highly. I am willing to limit some powers to achieve some features – e.g. limit exclusive control of state in order to protect extensibility and access.

      • David Clark says:

        We will have to agree to disagree on state being a problem. I am not familiar with OpenCV or Unreal4. I thought that OpenCV might be a programming version control system but Google says it is a Computer Vision library. If that is the case then I wouldn’t write my own either!

        My “remain under control” comment was just my trying to be funny and I do take these issues very seriously. Power tools to me are fully functioning lists, tables, queues, stacks, maps, arrays, indexes, etc., with a few variations so that you don’t need to use the full feature List type if something small will do. All the power tools I have contain a “sort” function where that makes sense etc. Why reinvent the wheel?

        I think state should have exclusive control for all the reasons I have stated BUT I also see where exposing that internal state to outside functions under controlled conditions can be very useful. If libraries provided such a facility, that would seem to fit your needs as well. I don’t believe code where the state was designed to be private could be correctly modified (maybe if just read only) if the language just let you in the back door. It could work but only if the functions in the library were designed for that purpose up front.

    • dmbarbour says:

      I get the sense that your blog uses ‘local’ in a different sense than I use it in this article. Modeling spatial-temporal relationships of stateful resources (latencies, partitioning) is something I do extensively in RDP. I also leverage substructural types to model finite resources and exclusivity.

      It is feasible to model state in a purely mathematical manner – e.g. using incremental folds over an input stream while manipulating linear structure. But doing so remains problematic with respect to other essential software concerns – such as maintenance, extension, persistence, and safe process control. The consequence of local state is that we must hack the model to address these other problems.
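
      For readers unfamiliar with the idea, an “incremental fold over an input stream” can be sketched in a few lines of Python. This is only an illustration; the event names and the `step` function are invented for the example. State is never mutated in place: each new state is computed from the previous state and the next input.

```python
from itertools import accumulate


def step(balance, event):
    """Pure transition function: (old state, input) -> new state."""
    kind, amount = event
    if kind == "deposit":
        return balance + amount
    if kind == "withdraw":
        return balance - amount
    return balance  # unknown events leave the state unchanged


events = [("deposit", 100), ("withdraw", 30), ("noop", 0), ("deposit", 5)]

# accumulate yields every intermediate state of the fold, starting at 0.
states = list(accumulate(events, step, initial=0))
print(states)
```

      The running balance exists only inside the fold. Persisting it, upgrading `step` while the stream is live, or letting another component extend the state all require stepping outside this model, which is the difficulty the comment above points at.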

  14. wtpayne says:

    I largely agree, although I would couch the argument in different terms, since I think the global-vs-local dichotomy might be an orthogonal matter.

    Our brains struggle to reason about how state evolves over time. Add in concurrency, and the problem easily becomes intractable. On top of this, testing stateful components is burdensome.

    So, the state needs to be kept as separate as possible from the complex algorithmic logic of the application, so that the state-handling-parts can be kept as simple as possible, and the complex parts can be kept as easy as possible to reason about and to test. If this means that the state is handled globally, then fine, but it is not really about where the state is held, but rather about how easy is it to reason about and test.

    My rule of thumb is this: we should be able to test our complex mathematical and algorithmic components as stateless (pure) functions, independently of any stateful parts of the application. The remaining stateful parts of the application should have a simple and well understood lifecycle, preferably well away from any concurrency, and with tightly controlled and documented state transitions. (OOP concepts are handy for this, although the OOP monster must be kept on a tight leash).
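
    A minimal Python sketch of this rule of thumb follows. All names here (`moving_average`, `Job`, `Phase`) are made up for illustration. The algorithmic part is a pure function that can be tested in isolation, and the stateful part is a thin shell with an explicit table of allowed state transitions.

```python
from enum import Enum


class Phase(Enum):
    IDLE = "idle"
    RUNNING = "running"
    DONE = "done"


# The complete, documented lifecycle of the stateful shell.
ALLOWED = {(Phase.IDLE, Phase.RUNNING), (Phase.RUNNING, Phase.DONE)}


def moving_average(xs, window):
    """Pure algorithmic core: no hidden state, easy to test."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [sum(xs[i:i + window]) / window for i in range(len(xs) - window + 1)]


class Job:
    """Thin stateful shell: holds the lifecycle phase and the result, nothing else."""

    def __init__(self):
        self.phase = Phase.IDLE
        self.result = None

    def _move(self, new_phase):
        if (self.phase, new_phase) not in ALLOWED:
            raise RuntimeError(f"illegal transition {self.phase} -> {new_phase}")
        self.phase = new_phase

    def run(self, xs, window):
        self._move(Phase.RUNNING)
        self.result = moving_average(xs, window)
        self._move(Phase.DONE)


job = Job()
job.run([2, 4], 2)
print(job.phase, job.result)
```

    Tests for `moving_average` never need to construct a `Job`, and tests for `Job` only need to check the transition table.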

    • dmbarbour says:

      Well said. State should be external to the application (including OO objects), but it is important that we can reason about state. The object capability model and substructural types are each very useful for reasoning. E.g. we can exclusively bind external state to an OO object.

  15. redondo says:

    Global state is actually a singleton that functions as a model in an MVC architecture. The problem is how to model it in such a way that you don’t try to second guess yourself and beat yourself up on the object modelling aspect.

  16. Jim Smart says:

    What you describe here, i.e. typed-data values with pathed keys, is not unlike the Windows Registry.
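
    To make “typed-data values with pathed keys” concrete, here is a small Python sketch. It is an assumption-laden illustration, not the article's design and not the Registry API: `StateSpace`, `put`, `get`, and `chroot` are hypothetical names. Each cell is addressed by a path and remembers its type, and a subprogram can be handed a view confined to a subtree, in the spirit of the chroot-style partitioning the article mentions.

```python
class StateSpace:
    """Hypothetical path-keyed store of typed cells, with chroot-like views."""

    def __init__(self, cells=None, root=()):
        # All views share one dictionary of cells; only the root prefix differs.
        self._cells = {} if cells is None else cells
        self._root = tuple(root)

    def _key(self, path):
        parts = tuple(p for p in path.split("/") if p)
        if ".." in parts:
            raise ValueError("paths may not escape the view")
        return self._root + parts

    def put(self, path, value, type_):
        if not isinstance(value, type_):
            raise TypeError(f"{path}: expected {type_.__name__}")
        key = self._key(path)
        existing = self._cells.get(key)
        if existing is not None and existing[0] is not type_:
            raise TypeError(f"{path}: cell already has type {existing[0].__name__}")
        self._cells[key] = (type_, value)

    def get(self, path):
        return self._cells[self._key(path)][1]

    def chroot(self, path):
        """Return a view that can only reach cells below the given path."""
        return StateSpace(self._cells, self._key(path))


root = StateSpace()
ui = root.chroot("app/ui")        # the UI subprogram only sees its own subtree
ui.put("width", 800, int)
print(root.get("app/ui/width"))   # the same cell, seen from the full space
```

    Because the state lives in one addressable space, a generic tool holding `root` can inspect or persist every cell, while `ui` cannot name anything outside `app/ui`.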

  17. When you invoke an object class to create an instance, you are invoking a global shared, so all things with its merit.

    “local state is good, global shared state is bad” and all those (very easily over)simplistic kind of thoughts are like pain killers. They might alleviate you in a moment of affliction but they can also be addictive beyond the point of benefit. In that regard, yes, something could be poison.

    Your line of thought here will make you converge to invigorate some kind of proceduralism. Sorry I don’t know what your domain problem is but you seem to be experiencing an object oriented overhead that is starting to hurt.

    You can go ahead and proceduralize things (functions against a remote datastore) but I wouldn’t be so fast in questioning the object design fundamentals. I would try harder* to remove the original painful overheads, or whatever the real pain is, by refactoring or redesign.

    *by harder I don’t mean to be muscular or that you aren’t paying effort to it. Harder could mean to do something as easy as asking hacker friends to use their fresh unbiased view for a problem/code review.

    Listen all, pay attention to some, then ignore everybody (including me)

    best

    o/

  18. Sid Cypher says:

    I really like the idea!
    What would you think of a header library for lightweight in-memory filesystems?
    One that you can easily add to your project and make globally-accessible filesystems to hold your programs’ state and data.
    And for each you can toggle awesome features with a flick of a define flag. Sounds really nice.

    Do you, by any chance, know of existing things similar to that? The closest I found is talloc, and that’s not close enough.

  19. Isn’t this issue addressed in part by the Smalltalk/GemStone Database systems?

    As far as I know, GemStone allows you to persist objects with their state in a database, so there is no translation between your programming language and your data, and all objects are ‘live’.
    (I am not affiliated with either in any way)
    http://gemtalksystems.com/index.php/products/gemstones/

    • dmbarbour says:

      Persistent object systems like Smalltalk, Gemstone, Hibernate, etc. do not address the update issues. I.e. if you change your program code in a manner that affects schema (how data is organized among objects), you cannot readily restructure data in existing objects to be consistent with the new code and schema.

      Also, persistence in Smalltalk, Gemstone, etc. is achieved from a privileged global position that violates abstractions and encapsulation, and is thus neither very precise nor safely extensible.
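
      As a hedged illustration of what “restructuring data to be consistent with the new code and schema” can look like when state is plain external data, consider the Python sketch below. The record layout, the `version` field, and the `upgrade` helper are all assumptions made for the example. They are not a feature of Smalltalk, GemStone, or Hibernate.

```python
def v1_to_v2(record):
    """Schema change: split the single 'name' field into 'first' and 'last'."""
    first, _, last = record["name"].partition(" ")
    return {"version": 2, "first": first, "last": last}


# One migration function per old schema version (hypothetical convention).
MIGRATIONS = {1: v1_to_v2}
CURRENT_VERSION = 2


def upgrade(record):
    """Apply migrations step by step until the record matches the current schema."""
    while record["version"] < CURRENT_VERSION:
        record = MIGRATIONS[record["version"]](record)
    return record


# External state written by an older and a newer version of the program.
store = {
    "user/1": {"version": 1, "name": "Ada Lovelace"},
    "user/2": {"version": 2, "first": "Alan", "last": "Turing"},
}

upgraded = {key: upgrade(record) for key, record in store.items()}
print(upgraded["user/1"])
```

      The migration is an ordinary function over visible data, so it can be written, tested, and run by a component other than the one that originally created the records.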
