UI as an Act of Programming

I’ve been thinking quite a bit, recently, about the relationship between UI and programming. A programming language is certainly a user interface – but usually a poor one from a human factors standpoint. And a UI can be considered a PL – a way for a human to communicate to a computer… but is usually a poor one with respect to modularity, composition, safety, extensibility, and other properties.

What I would like to see is a unification of the two. To me, this means: We should be able to take all the adjectives describing a great PL, and apply them to our UI. We should be able to take the adjectives describing a great UI, and apply them to our PL. Manipulating the UI is an act of programming. Programming is an act of UI manipulation.

Conal Elliott’s concept of Tangible Functional Programming seems related to what I want. Bret Victor’s Drawing Dynamic Visualizations seems related to what I want.

But neither of those is general purpose.

I need something that works for controlling robots, developing video games, reinventing browsers and web services, interacting with a 3D printer, implementing an operating system or a device driver. I want support for open systems. Different problem domains can have different domain-specific languages – which should correspond to domain-specific UIs. However, these DSLs must integrate smoothly, and with a common paradigm for basic communication and data manipulation. Can we do this?

In August, I developed a simple but powerful idea: a user-model – hands, navigation – represented within the static definition of the program. This, coupled with the notion of ‘rendering’ the environment type, has me thinking UI/PL unification is feasible. The user-model is not a new idea. Nor is the idea of integrating the user-model with the program; it has been done before, e.g. in a MOO, or in ToonTalk.

The only ‘new’ idea is taking this seriously, i.e. with an intention to support open systems, down-to-the-metal static optimizations, a basis for UI and mashups, and general purpose programming. I think the different mindset would have a significant impact on the design.

Most serious PLs treat the programmer as a non-entity – ‘above’ the program with pretensions of omniscience and omnipotence. Even graphical PLs do this. As a consequence, there is no direct interaction with higher level types, software components, dataflows, composition. There is no semantic ‘clipboard’ with which a programmer can wire variables together. Instead, interactions are expressed indirectly. We look under-the-hood, investigate how components work, from where they gather their data; we perhaps replicate some logic or namespace management. The barrier between syntactic and semantic manipulation is painful, but is not a crippling issue for “dead programming” languages. But UIs are live. They have live data, live values, and there is a great deal of logic underlying the acquisition and computation of those values. They have live buttons and sliders, which may correspond to capabilities controlling a robot. In many senses, peeking ‘under-the-hood’ for UIs should correspond to reflection in a PL – an alarming security risk and abstraction violation, not something we should need to depend upon. Instead, we should be able to treat time-varying data, buttons, sliders as semantic objects – signals, parameters, functions, capabilities.

Users can navigate, grab, copy, create, integrate. Users can construct tools that do so at higher levels of abstraction. The syntax becomes implicit – the gestures and manipulations, though perhaps corresponding to a stream of words in a concatenative language.

To unify UI and PL, we need our programmers to be part of our PL, just as our users are part of our UI. We simply need to do this while taking the language and UI design seriously.

This entry was posted in Language Design, Types, User Interface.

14 Responses to UI as an Act of Programming

  1. John Shutt says:

    Years ago I remember reading an editorial (could even have been in BYTE magazine) pointing out that spreadsheets had become programming languages behind our backs, evolving from user interfaces so that they completely missed all the lessons learned over the decades about how not to design a programming language. Resulting in people who “don’t know how to program” doing amazing things with spreadsheets via “macros”. The complementary problem is PLs failing to learn from UIs. Verily, we need overall vision on the unity of these things.

  2. Dave Orme says:

    John’s point about spreadsheets resonates. Spreadsheets are essentially functional-reactive programming languages for data and calculations. Smalltalk was designed to be exactly what you’re describing: there is no distinction between runtime and design/compile time in Smalltalk. I keep meaning to learn Squeak for exactly this reason. Lisp… But to be practical, a system like this needs to integrate and interact with all the other universes of programming languages, and as you point out, gestures and affordances have never really been unified with programming language semantics.

    I agree with your thesis. Deeply.

    • dmbarbour says:

      I think Smalltalk developers had the vision but failed to formalize or achieve it. Instead, MVC was developed, separating users from the model, and further separating users from the code that separates them from the model.

      RE: “no distinction between runtime and design/compile time”

      I favor support for ad-hoc staging in PLs, i.e. not one big ‘compile-time’ but a lot of small, obvious opportunities for compilation. My RDP paradigm was designed to make this easy. Implicitly, any interaction between a relatively slow-changing signal and a relatively fast-changing signal can be seen as a staging opportunity (via specialization). Explicitly, staging is expressed by dynamic behaviors.
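      The specialization idea above can be illustrated outside RDP. The following is only a loose Python analogy (the function names and the config shape are hypothetical, not Awelon/RDP constructs): a slow-changing value is read once at a “staging” step, producing a specialized function for the fast-changing signal path.

      ```python
      def specialize(config):
          """Stage 1: 'compile' a slow-changing config into a fast-path function."""
          gain = config["gain"]      # read once, at specialization time
          offset = config["offset"]

          def apply(sample):
              # Stage 2: the fast-changing signal path touches only locals.
              return sample * gain + offset

          return apply

      # The config changes rarely; the samples change every tick.
      scale = specialize({"gain": 2, "offset": 1})
      assert [scale(s) for s in (0, 1, 2)] == [1, 3, 5]
      ```

      A real compiler could go further and emit specialized machine code at that boundary; the closure merely marks where the staging opportunity sits.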

      Staging can be very valuable for open systems. (I.e. scripts, extensions, plugins, configuration data, resource discovery, database schema – such things can often be “compiled in” at one stage or another.) Staging can offer useful properties similar to type-safety, i.e. you can’t necessarily guarantee the code will be ‘safe’ before you try to install it, but if it was unsafe it will atomically fail to install.

      (Thought: dependent types might enable safe install of dynamic code, but I’m not sure how to get dependent relationships into the user experience while keeping things easy to visualize and compose.)

  3. emd says:

    With you all the way on the PL/UI synthesis. I’d even take your remark “PL – a way for a human to communicate to a computer” so far as to say that many popularly imagined forms of AI could probably be solved by a PL with a sufficiently “natural” grammar. However, there is one thing that has been troubling me about sufficiently radical reformulations of the programmer-editor-code programming model, and I think this post has helped me put my finger on it: I do not have a good sense of where “domain specific” ends and “general purpose” begins.

    When you describe your named-stacks + user-hands programming environment, what kind of programming are you envisioning the user doing? Intuitively, I understand your usage of “general purpose” to mean that the paradigm can be useful across a spectrum of traditional programming domains (web, OS, robotics, etc.), but since it strikes me that the affordances of a VR programming environment would potentially encourage a very different conception of programming from traditional text-based programming environments, are you inclined to view the VR model as an augmentation of text-based? A tool for a different concern (such as the creation of UI/DSLs)? Or one of many possible UI/DSLs alongside text-based and other forms of programming aimed at a specific type of spatial or representational concern? When do you envision a programmer thinking to themselves, “aha, this would be a good time to use my RDP VR editor instead of emacs”?

    Thanks.

    • dmbarbour says:

      I’ll first answer your final question: the ease-of-programming wouldn’t be the only consideration, and I think it wouldn’t even be a primary consideration. After all, if PL/UI are unified, then the UI interactions are themselves a significant part of the experience. A user – a potential programmer – will be involved with the UI for some purpose other than programming. But if an act of programming (especially in the more concrete form) is really cheap, then users will program things as part of expressing themselves. That is the goal of this synthesis.

      Regarding the environments: I would use different environment-models for different visualizations. But I would still keep everything backed (and formalized) by text. In Awelon, switching between different environment models is mostly an issue of library choice.

      For the “named-stacks + user-hands” environment, I envision something closer to a mixed text/graphics desktop IDE. Developers are focusing on one ‘stack’ at a time, and it is rendered prominently. Other stacks, hands, etc. may be rendered in the periphery. They can use text or directly manipulate the graphics. The IDE provides suggested factorings, perhaps helps developers build loops. In this case, the bulk of the environment is real content.

      An environment for Augmented Reality would need to be different. For every ‘real’ value, there’d be a huge mess of UI values such as visual fingerprints for bindings, orientation, size, potential meshes. Depending on how much is procedurally generated, perhaps 95-99% of the content in the compile-time environment would be UI related. But even this UI content would still be manipulated directly as part of the programming environment – e.g. duplication of objects, binding to new visual surfaces, rebinding parts of the environment that aren’t visible, reorienting and grabbing objects, etc. A stream of text might be generated in the periphery of the user’s vision, indicating how gestures are being interpreted or how history is being ‘rewritten’ to eliminate or smooth out the irrelevant intermediate states. (Some use of ‘exponential decay of history’ seems like it might be useful here.)

      An environment for virtual reality would probably be somewhere in between. Visual fingerprints aren’t necessary, but we still need meshes, locations, orientations.

      In all cases, a good programming environment could be keeping an eye out for patterns in the stream of words. These could be opportunities for macros, loops, programming-by-example. One of the big advantages of keeping everything formalized with text is that metaprogramming becomes very accessible. The programmer model has the exact same authorities and abilities as the programmer, but can be instructed to build things that the programmer lacks the patience to accomplish by hand. So, programming the programmer is one of the better opportunities for UI extension.

      Interestingly, software components developed in any programming environment would be usable in the others. This works because software components rarely have more than a few real inputs. Inputs may need to be rearranged and wrapped a little, but that’s easy to automate.

      I’m sure you’ve noticed that I’ve said nothing about DSLs so far. I consider the whole text/VR/AR environment to be pretty much orthogonal to DSLs. It is certainly possible that an AR or VR or text environment is better for certain domains or problems. But I think in many cases the advantages might not be sufficient to overcome the physical and emotional hassles of switching environments.

      In general, DSLs operate at a higher level than basic manipulations. DSLs represent different content, different ontology. In the named-stack IDE, we might have ‘DSLs’ for different document-like structures: text manipulation vs. diagrams vs. dialog trees vs. widgets. Some operations (zippers, traversals, transclusion) might work generically across a broad variety of documents, while others are very specific to the document type. In a VR, we would similarly need DSLs – for documents, but also for scene-graphs, for simple animations, for suggested interaction rules, and so on. For AR, we might want a concept of 3D widgets to control smart-homes, and we might want rules to play eerie music when we look into dark closets at night.

      In practice, I think we might identify DSLs with tools instead of text, especially in an AR or VR environment where direct manipulation of text is difficult. We can represent a sword, paintbrush, chisel, mattock, wand, a keyring, and so on to help interpret our gestures and translate them to actions on the environment in a more problem-specific way. Several of our actions then become related to switching tools to switch problem domains. (The concept of ‘tooling’ the hand has certainly been around: it’s used in MOOs, ToonTalk, PaintShop, etc.). An interesting possibility is to use actual haptic tools rather than just virtual objects – though obviously we’d need to use a foam Minecraft mattock instead of the real thing. :)

      • emd says:

        Thanks, that clarified a lot. I think I had grafted a few ideas that were to you orthogonal onto the same elephant and was consequently having trouble discerning its shape. I am very much aligned with the idea of programming flowing naturally out of interaction, and I am wondering how you might view the line (or perhaps continuum) between developer and user. Going from my understanding of your descriptions and a bit of speculation I might imagine a developer A that writes (in text) an environment for constructing MOOs, a developer B that enters the MOO environment as a player with administrative powers (creating rooms and objects but also walking between them and picking them up, respectively), and a player C that enters the very same environment that A developed, just without the administrative authority (ie walking between rooms and picking up objects, but creating neither). Does that resonate at all with how you are envisioning the relations between developers, users, and environments? Are the developer and user environments more separate in your conception? It seems to me that passing an amount of constrained programmatic control from the base system to the end user could help to ameliorate the reinvention of ad hoc programming models in every new interface.

      • dmbarbour says:

        In my vision, there would be two primary differences between players and administrators. The first is intention. Obviously the player is there to play. The second is power.

        Power, in Awelon, will be modeled directly within the language – i.e. in terms of capabilities, sealer/unsealer pairs. Capabilities are a much more precise and expressive basis for security than identity.

        A game developer has no ability (and no right!) to curtail the powers of other humans. However, a game developer can certainly control which powers are granted to other humans. So there will be a box-of-capabilities associated with becoming a player of the game, and a much bigger box-of-capabilities granted to the administrators.

        Player C can always create new objects, but is limited to doing so under his or her own authorities – which now include ‘player’ caps… which might not permit injecting ad-hoc objects into the “game environment” where other players can see them. (Barriers can be modeled using sealer/unsealer pairs.) But a player can still construct macros and view-extensions, bots. A really clever player might figure out how to treat his avatar as part of a larger game. :)

        Interestingly, security types are also useful within a game – e.g. using sealer/unsealer pairs to model locked rooms/containers and keys, capabilities to model special weapons, affine types to model ammunition, and so on.
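        The sealer/unsealer pattern mentioned above is easy to sketch. This is not Awelon code – just a minimal Python illustration of the idea (all names here are hypothetical), where an unforgeable brand shared between two closures models a lock and its only matching key.

        ```python
        class _Sealed:
            """An opaque box; its contents are reachable only via the matching unsealer."""
            def __init__(self, brand, value):
                self._brand = brand
                self._value = value

        def make_sealer_pair(name):
            """Return (seal, unseal) closures sharing a unique, unforgeable brand."""
            brand = object()  # object identity serves as the unforgeable token

            def seal(value):
                return _Sealed(brand, value)

            def unseal(box):
                if box._brand is not brand:
                    raise PermissionError(f"wrong key for {name!r}")
                return box._value

            return seal, unseal

        # A locked room: only holders of the matching key (unseal) can open it.
        lock_vault, vault_key = make_sealer_pair("vault")
        box = lock_vault("treasure")
        assert vault_key(box) == "treasure"

        lock_shed, shed_key = make_sealer_pair("shed")
        # shed_key(box) would raise PermissionError: the brands differ.
        ```

        In a capability-secure language the same pattern also builds the “barriers” discussed above: whoever holds only the sealer can publish sealed values without being able to open anyone else’s.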

        A game administrator needs a tough barrier to control environments and enforce ‘game rules’. Similar concerns apply to social sites. But I think many systems could be reduced to programmable APIs. Augmented reality is also interesting: it seems to me that physical proximity can often serve as a reasonable basis for acquiring capabilities to simple devices (like an alarm clock), or a handshake might be a basis for granting some minimal set of voiced capability.

        Returning to your questions: I don’t really see any need for “separate environments” at all. Even the game is part of a larger environment, just with a barrier to protect certain in-game rules. I’d ultimately like “the programming environment” to be singular, just like “the Internet”. The ability to enforce barriers and control distribution of power is, of course, critical to make this happen.

      • emd says:

        I had your discussions of authority/ocaps/sealers/types firmly in mind when conceiving of the example. What you describe is about what I expected based on those, and a very tantalizing proposition. Thanks for the discussion and keep the insights comin’!

  4. CuriousSkeptic says:

    On the related-things side, make sure to check out Jeff Raskin’s “Humane Environment”. It’s a UI designer’s view of how to build systems that empower the user more.
    E.g., he argues that the concept of applications (silos of functionality) should be abandoned and replaced with a document + tools focused model.

    • dmbarbour says:

      Thanks for the suggestion. I’ve put the Humane Interface book on my wishlist. (I still intend to get through Edward Tufte’s work and a few others first.)

      Another related article that I found very shortly after writing mine is Paul Chiusano’s blog: End of Apps.

      • John Nilsson says:

        Yeah, Tufte is great, quick read too :) On the topic of UX in general: Donald Norman’s Design of Everyday Things is a must if you haven’t read it yet. A warning, though: navigating through the world becomes a really frustrating experience after reading it ;-)

      • John Nilsson says:

        Hmm, that End of Apps link stirred up old dreams :-) I feel really inspired to contribute to this vision.
        One thing that struck me (reading about type theory) is that type classes are probably one of the key ingredients on a social scale.
        I think one of the conclusions that can be drawn from the Scala experiment of an OOP/FP hybrid is that behavior really shouldn’t be tied to objects in the form of methods, and that type classes provide a better means of providing behavior according to some contract. (Btw, you might find this discussion on the topic interesting: https://groups.google.com/forum/#!topic/scala-debate/QTo_n3U9B0k – it starts with me arguing for dynamic dispatch of type classes and ends with an interesting discovery about a flaw in Scala’s co-/contravariance directions when dispatching to an implicit type class.)

        With type classes we have a three-way split between the objects acted upon, the contract of the actions, and the implementation of those actions for a given context. This means we can have independent innovation and implementation in all three areas – not least, by different groups of people with different interests.
        Incidentally, this is a strategy that Microsoft seems to have somewhat adopted for the .NET platform; even though they employ ad hoc polymorphism due to limitations in the language, the method is the same. Interfaces like IEnumerable and IObservable are the contract. A library of extension methods (a type class) decorates those interfaces with behavior, e.g. Reactive Extensions, and then people implement the interface for various objects. Scala improves on this by letting a fourth party implement implicit mappings from concrete types to types implementing the interface. While I think Rx is still the only implementation of its type class, there are numerous LINQ providers already.

        So my point is: I think dynamically dispatched type classes could be a good component in an environment where people need to discover and create new content types, new ways to operate on them, and new ways to generalize and map old types and behaviors, in a collaborative way.
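        The three-way split described above can be sketched in miniature. This is just an illustration (the `pretty` contract and its instances are invented for the example), using Python’s `functools.singledispatch` to get dynamically dispatched, late-registered instances: the types, the contract, and the instances can each come from different parties.

        ```python
        from functools import singledispatch

        # The "contract": a pretty-printing behavior, defined apart from any type.
        @singledispatch
        def pretty(value):
            raise NotImplementedError(f"no pretty instance for {type(value).__name__}")

        # "Instances" registered separately -- possibly by third parties,
        # long after both the types and the contract were published.
        @pretty.register(int)
        def _(value):
            return f"Int({value})"

        @pretty.register(list)
        def _(value):
            # Dispatch recurses on the runtime type of each element.
            return "[" + ", ".join(pretty(v) for v in value) + "]"

        assert pretty([1, 2]) == "[Int(1), Int(2)]"
        ```

        The dispatch happens on the runtime type, which is the “dynamically dispatched type classes” part; a statically resolved variant would pick the instance at compile time instead.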

      • dmbarbour says:

        Yeah, I also think some variation on typeclasses could be a very useful basis for generalizing behaviors and observations on sealed values.

        For an open system, of course, it’s essential to address the problems of: “how do I find the typeclass instance in the environment?” and “how did it get there?” and “how do we securely maintain these instances?” Haskell hand-waves over this. I think Awelon’s formalized environment model should make this much more precise and accessible.

        If you really want to get your hands dirty on this subject, I have started a dedicated thread in the reactive-demand google group. I have been contemplating how to unify humans with other agencies in this PL/UI, especially with respect to live and staged programming.

  5. Matt Carkci says:

    Hello, I am the author of an upcoming book called “Dataflow and Reactive Programming Systems”, on Kickstarter now: http://www.kickstarter.com/projects/1712125778/dataflow-and-reactive-programming-systems.

    I feel it will be a beneficial addition to the sparse literature out there on dataflow because I intend to explain it in simple, practical terms. Please take a look and see if your readers might be interested in learning about it.

    I’ll be happy to answer any questions you may have.

    Thank You,
    Matt Carkci – Author
