Natural Programming Language

A recent essay about why natural language isn’t computer code has been making the rounds in the PL community. It celebrates the precision required by computer languages and the differences between coders and writers. The article concludes:

Fundamentally, good writers are empathetic, human and stylish. Good coders are literal, formal and algorithmic. To take the coding-as-prose analogy too far is to misunderstand the essence of both.

I’d love to develop software that is mostly empathetic, human, and stylish, with just a few stilted algorithmic bits where I really need them. And while state-of-the-art languages don’t support it today, all the components necessary for it have been developed in different systems. Here is what I believe we can do to make programming languages more natural:

  1. Use probabilistic or weighted grammars to admit programs that may contain errors.
  2. Leverage constraint systems and searches for code that achieves goals. This allows the user to underspecify the code, and the compiler to fill the gaps with something that is at least moderately sane. When we don’t like the compiler’s results, we refine our constraints, much like we refine queries to a search engine. This allows us to sketch our code and still achieve something useful (see the sketch just after this list).
  3. Support rough fitness heuristics – soft constraints and weights, cost-benefit analysis, a user model of known preferences. These let us push towards “better” solutions that fit the user’s intention (also illustrated below).
  4. Leverage paraconsistent logics, which allow us to constrain the propagation of inconsistency. This can allow us to program by analogy and metaphor without following those analogies “all the way down” to the extremes where they fail in silly ways. This could allow a much richer model for mixins, traits, and composition.
  5. We can develop semantics that reduce commitment to action, allowing users and developers to understand the consequences of their programs (not just locally, but in a real system) without committing to those actions, leaving room for refinement. In short: allow takebacks (sketched below this list).
  6. Our programming models, IDEs, and UIs can better explain how they are interpreting the code. This lets users know when the computer understands what they mean, with less guesswork and greater confidence. In return, this trains a communication skill in users, who will quickly learn what to clarify up front and what can be left to search.
  7. We can extend to live and interactive programming, with a real dialog in both directions, where the computer itself can ask for clarification in the case of ambiguity or unknowns. Live, interactive programming is also a very viable approach to UI in the future age of ubiquitous computing.
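
To make items 2 and 3 concrete, here is a minimal sketch in Python. Everything in it is a hypothetical stand-in, not a real compiler: the user leaves a hole in the program, input/output examples act as hard constraints, and a crude soft preference (shorter is better) ranks whatever survives.

    # Hypothetical candidate expressions the compiler might try for a hole.
    CANDIDATES = {
        "x + 1":      lambda x: x + 1,
        "x * 2":      lambda x: x * 2,
        "x * x":      lambda x: x * x,
        "abs(x) * 2": lambda x: abs(x) * 2,
    }

    def preference(source):
        return -len(source)  # soft constraint: prefer shorter expressions

    def fill_hole(examples):
        viable = [src for src, fn in CANDIDATES.items()
                  if all(fn(i) == o for i, o in examples)]  # hard constraints
        if not viable:
            return None  # nothing fits: ask the user to refine
        return max(viable, key=preference)  # soft constraints break ties

    # double = <hole>, constrained by two input/output examples:
    print(fill_hole([(2, 4), (5, 10)]))  # "x * 2" wins over "abs(x) * 2"

Refining a constraint here is just adding another example – exactly the refine-and-retry loop we already use with search engines.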

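Item 5, in the same hypothetical style: effects are staged against a scratch copy of the state, their consequences can be explained before anything is committed, and rollback is a true takeback because the real state was never touched.

    class Transaction:
        def __init__(self, state):
            self.committed = state     # the real system state
            self.staged = dict(state)  # scratch copy we may mutate freely
            self.log = []              # human-readable consequences

        def set(self, key, value):
            self.log.append(f"{key}: {self.staged.get(key)!r} -> {value!r}")
            self.staged[key] = value

        def explain(self):
            return "\n".join(self.log)  # consequences, before commitment

        def commit(self):
            self.committed.update(self.staged)

        def rollback(self):
            self.staged = dict(self.committed)
            self.log.clear()

    tx = Transaction({"volume": 3})
    tx.set("volume", 11)
    print(tx.explain())  # volume: 3 -> 11
    tx.rollback()        # takeback: nothing was ever committed
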
No strong AI is necessary, though something like a domain-specific AI tends to be a natural consequence of any programming model that utilizes heuristics, searches, and a few caches or traces for performance and stability (which, together, model learning).
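
That parenthetical deserves a sketch of its own (again hypothetical, and deliberately dumb): cache the solutions a search has already found and try them first, and “learning” falls out as memoization.

    cache = {}

    def solve(goal, search):
        if goal in cache:           # a trace of a past success
            return cache[goal]
        solution = search(goal)    # the expensive heuristic search
        if solution is not None:
            cache[goal] = solution  # next time: stable, fast, "learned"
        return solution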

I believe that, one day, even our written words won’t be static, opaque things only for humans to read. The documents we write will be live documents, capable of composition, of gathering information resources, of providing explanations, and of including interactive examples, training exercises, interactive fiction, and so on. But this won’t happen unless such documents are easy to write – i.e. unless we can write them just as easily and imprecisely as we hack out English today.


4 Responses to Natural Programming Language

  1. John Shutt says:

    I’ve long believed that what makes computers really valuable is their precision – which is also what makes them really dangerous. Humans are valuable for their ability to deal with imprecision – not their imprecision, their ability to *deal* with imprecision. Key to that ability is that when precision gives way, there’s a human making value judgements based on their humanity. A computer dealing with imprecision would likely be the worst of both worlds (think HAL from 2001).

    That said, I agree there’s all sorts of interesting scope for melding rich (in one sense) computational infrastructure with rich (in a different sense) natural language.

    • dmbarbour says:

      Soft constraints allow humans to represent their policies and preferences – i.e. value judgements expressed by humans based on their humanity. Those policies would guide the decisions of a computer, in case of ambiguity.

      Also, we won’t have any AI overlords; rather, we’ll have billions of purpose-specific or user-specific AIs, each trying (within its available capabilities and communication and coordination protocols) to serve its individual purpose. The AIs won’t even be very smart; they just learn which solutions are likely to work well and try those first, with an emergent behavior of improving stability and performance in the system. The emergent behavior of the system might seem intelligent (i.e. rapidly finding solutions that satisfy the users, when such solutions exist), but it won’t be anything like HAL.

      Computers can deal with imprecision quite effectively – just not always efficiently. A relevant question is how much performance we’d be willing to sacrifice to make our systems human-friendly.

  2. Richard Alexander Green says:

    It all depends on what you want to do.
    – An English cookbook recipe is a program. It is designed to be executed by an actor, a cook. The cook is expected to have specific innate capabilities, including the ability to guess what we meant when we were ambiguous. At worst, the cook can make their own choice when they find our recipe ambiguous.
    – A machine control program is intended for a specific kind of robot. When we give it physical measures, the precision of those measures is often assumed.
    – A chat-bot program in AIML can have a stochastic or probabilistic response. The response may include the execution of some program the chat-bot has access to.
    – A language translation ontology (consider OpenCyc or a voice-to-text agent) typically has to be interpreted using probabilities.
    – As programmers, we have forgotten what a program is. We are like fish asked to reason about water. A traditional computer program is meant to instruct a programmable calculator. Can we get it to simulate something else? Yes, more or less.
    – Efficiency, in terms of machine resources, is no longer relevant. Machine cycles and storage have become so cheap that they are being given away in return for our momentary attention. We are in a potlatch system.
    – What does matter is our attention time. How long does it take to write the lines of code to create some effect? How long does it take to find and correct or enhance that code in the future? How long does it take for another person to read that code and learn from it?

  3. Gerry Rzeppa says:

    Some years ago my elder son and I developed a Plain English programming and development system in the interest of answering the following questions:

    1. Can low-level programs (like compilers) be conveniently and efficiently written in high-level languages (like English)?

    2. Can natural languages be parsed in a relatively “sloppy” manner and still provide a stable enough environment for productive programming?

    3. Is it easier to program when you don’t have to translate your natural-language thoughts into an alternate syntax?

    We can now answer each of these three questions, from direct experience, with a resounding “Yes”.

    Our parser operates, we think, something like the human brain. Consider. A father says to his baby son:

    “Want to suck on this bottle, little guy?”

    And the kid hears,

    “blah, blah, SUCK, blah, blah, BOTTLE, blah, blah.”

    But he properly responds because he’s got a “picture” of a bottle in the right side of his head connected to the word “bottle” on the left side, and a pre-existing “skill” near the back of his neck connected to the term “suck”. In other words, the kid matches what he can with the pictures (types) and skills (routines) he’s accumulated, and simply disregards the rest. Our compiler does very much the same thing, with new pictures (types) and skills (routines) being defined — not by us, but — by the programmer, as he writes new application code.
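
    A minimal sketch of that match-what-you-can strategy (a hypothetical Python sketch, not the actual Plain English compiler): bind the known skills and pictures, and let every “blah” fall on the floor.

    SKILLS = {"suck": "routine:suck"}     # known skills (routines)
    PICTURES = {"bottle": "type:bottle"}  # known pictures (types)

    def parse(sentence):
        skill, things = None, []
        for word in sentence.lower().split():
            word = word.strip(",?!.")
            if word in SKILLS and skill is None:
                skill = SKILLS[word]
            elif word in PICTURES:
                things.append(PICTURES[word])
            # every other word is "blah, blah" and is simply ignored
        return skill, things

    print(parse("Want to suck on this bottle, little guy?"))
    # -> ('routine:suck', ['type:bottle'])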

    A typical type definition looks like this:

    A polygon is a thing with some vertices.

    Internally, the name “polygon” is now associated with a type of dynamically-allocated structure that contains a doubly-linked list of vertices. “Vertex” is defined elsewhere (before or after this definition) in a similar fashion; the plural is automatically understood.

    A typical routine looks like this:

    To append an x coord and a y coord to a polygon:
    Create a vertex given the x and the y.
    Append the vertex to the polygon’s vertices.

    Note that formal names (proper nouns) are not required for parameters and variables. This, we believe, is a major insight. My real-world chair and table are never (in normal conversation) called “c” or “myTable” — I refer to them simply as “the chair” and “the table”. Likewise here: “the vertex” and “the polygon” are the natural names for such things.

    Note also that spaces are allowed in routine and variable “names” (like “x coord”). This is the 21st century, yes? And that “nicknames” are also allowed (such as “x” for “x coord”). And that possessives (“polygon’s vertices”) are used in a very natural way to reference “fields” within “records”.

    Note, as well, that the word “given” could have been “using” or “with” or any other equivalent since our sloppy parsing focuses on the pictures (types) and skills (routines) needed for understanding, and ignores, as much as possible, the rest.

    At the lowest level, things look like this:

    To add a number to another number:
    Intel $8B85080000008B008B9D0C0000000103.

    Note that in this case we have both the highest and lowest of languages — English and machine code (albeit in hexadecimal) — in a single routine. The insight here is that (like a typical math book) a program should be written primarily in a natural language, with appropriate snippets in more convenient syntaxes as (and only as) required.

    You can get our development system here: http://www.osmosian.com/cal-3040.zip . It’s a small Windows program, less than a megabyte in size. If you start with the PDF in the “documentation” directory, before you go ten pages you’ll be recompiling the whole shebang in itself (in less than three seconds on a bottom-of-the-line machine from Walmart).

    Questions and comments should be addressed to gerry.rzeppa@pobox.com

    Thanks.
