VCache, an Acid-State Killer

Haskell has a convenient package called acid-state, which enables a Haskell value to be utilized as a database, essentially via the command pattern: updates are logged, and programmers can perform the occasional checkpoint; on restart, the checkpoint is loaded and updates are applied. In Haskell, treating the database as a value is convenient because it allows developers to build the domain model and operations upon it from pure functions, which are easy to reason about.

However, acid-state doesn’t scale nicely. Large checkpoints take a long time to load at startup. Large values in memory create a large burden for the Haskell garbage collector, which must occasionally perform a full sweep. Also, acid-state doesn’t compose very nicely. If you have two acid-state resources, there is no way to update both of them atomically. If you have some volatile storage for the particular process, you cannot easily keep it consistent with the acid-state resource.

Haskell has plenty of support for other databases, of course. So we can scale by switching to one of those. But this means accepting a lot of incidental complexity and semantic noise, giving up much simplicity from acid-state.

I want the simplicity of acid-state AND scalability. Oh, and composition would be nice, too.

To that end, I am currently developing VCache. The primary innovation of VCache is to support a clean separation of value references of kind VRef vs. persistent variables of kind PVar.

A VRef is a handle to an immutable value that is primarily represented in an auxiliary address space. Importantly:

  1. VRef’d values may contain VRefs, supporting tree structures and DAGs.
  2. Construction of a VRef and dereferencing values are pure computations.
  3. Dereferenced values are usually cached to reduce serialization overheads.
  4. VRefs are tracked via weak refs, and aren’t GC’d from the auxiliary space.

VRefs can represent very large domain models without overly burdening system memory. Note that these models don’t need to be persistent. Use of VRefs to create very large, volatile models may be useful for some applications, e.g. for the same reasons you might want a temporary filesystem. Volatile values arise naturally with lazy evaluation, optimistic concurrency, constraint solving search, and so on.

A PVar is a variable named by bytestring, and may be loaded from one execution to another. The content of a PVar is the same class of values as a VRef, and thus PVars can contain very large domain models – much larger than system memory – assuming developers structure them appropriately.

PVars are implemented as thin wrappers around STM TVars, and may be updated transactionally with other PVars and TVars (arbitrary STM operations may be lifted into a PVar transaction). The resulting transactions are atomic, consistent, isolated, and optionally durable. (Durability simply involves waiting for a synchronization after running the transaction.)

Supporting STM transactions together with persistent variables is very convenient for ensuring the volatile elements of the application are consistent with the persistent elements. The transactions also support composition and extension – i.e. it’s easy to add new persistent features without trying to modify a global datatype. Optimistic concurrency does come with a few weaknesses of its own, of course. Large transactions may ‘starve’ if small, fast transactions repeatedly manipulate variables used by the larger transaction. But I believe this concern is easy to address at the application layer, e.g. by constructing queues and cooperation patterns around the more contentious resources.

The VCache auxiliary address space is backed by LMDB. LMDB is a a read-optimized embedded key-value store that utilizes MVCC to ensure readers never wait on writers. LMDB is limited to one writer at a time… but, within that limit, it at least writes efficiently. Usefully, VCache mitigates LMDB’s few weaknesses. The Haskell STM layer easily supports concurrent read-write transactions. The VCache will amortize synchronization costs by combining transactions that occur around the same time. Lazy construction of VRefs may avoid some writes entirely.

The VCache garbage collector uses reference counting at the LMDB layer, and operates incrementally and concurrently in a background thread. As our LMDB file scales to gigabytes or terabytes or larger, we never need to sweep the whole thing. When the application is restarted, the garbage collector can eliminate values that were held only by volatile VRefs on the previous run. Usefully, LMDB covers the whole compaction issue as part of maintaining the underlying B-trees.

With regards to value caching, programmers will have some heuristic control over cache behavior on a per value basis, and the ability to roughly indicate roughly how much memory to use (albeit, based roughly on LMDB encoding rather than actual Haskell memory consumption). Values are also organized in VCache such that, at least for smaller values, structures allocated at around the same time will typically be on the same pages in LMDB. This allows effective use of smaller values.

VCache performs structure sharing at the VRef layer. This is useful because it saves space, because it may save some writes to disk, because it simplifies some forms of reasoning (e.g. about space costs for idempotent updates to a historical database), and because it allows comparisons of VRefs without loading the underlying values – e.g. such that we can efficiently perform a ‘diff’ on large trees that are mostly the same. When not used, the performance and space overhead should be is minor. I feel that structure sharing is one of those features that’s easiest to leverage when it’s pervasive.

My plan is to use VCache to support my Wikilon project.

Wikilon has an unusual ‘file’ system: files consist of bytecode and represent purely functional objects. That bytecode may contain values. But, for very large values – e.g. a tree map with a thousand elements – I’d rather avoid loading and parsing the bytecode to construct that value in memory. Instead, I’ll include a simple escape in the bytecode to access the value as a VRef. This allows me to preserve the O(lg(N)) updates or queries on the map, as opposed to the O(N) serialization costs for the full map. Further, that map could easily be part of a query response or function input. VCache will make it possible to treat persistent objects in a first-class manner.

VCache is not ready for use. And it won’t see much work over the next couple weeks (I’m on vacation). But I’ve hammered out most of the details, I’ve written the Haskell bindings to LMDB, and I’ve started implementation. I’m excited about this. I think, if you’re a Haskell developer or are interested in creating very large Haskell apps, that maybe you should be excited about it, too.

This entry was posted in Concurrency, Modularity, State. Bookmark the permalink.

8 Responses to VCache, an Acid-State Killer

  1. saizan says:

    What do you mean exactly by structure sharing at the VRef layer?

    • dmbarbour says:

      VCache collapses all values with the same serialized representation, such that they have the same address. Everything is ‘interned’. VRefs with the same type may be compared for value equality based on address.

      This feature relies on content addressing, via an additional map of hash→[address], implemented as an LMDB database. Originally, I contemplated storing values directly in the hashmap, but decided against this because it scatters reads across the entire address space and lacks nice caching behavior.

      • saizan says:

        I wonder if this would be suitable for manipulating syntax trees when typechecking dependently typed languages.

        It might allow to cache all the possible reductions of a term, by using the VRef’s as keys for a memoization table.

        Having the actual data stored offline might be too slow though, even if a major bottleneck seems to be memory atm.

  2. Pingback: Haskell VCache 0.1 | Awelon Blue

  3. Pingback: Wikilon Dictionary Applications | Awelon Blue

  4. Pingback: Wikilon Performance Strategy | Awelon Blue

  5. baa dg says:

    Haskell has bindings to Kyoto Cabinet and other key/value databases. KC in very fast and safe, didn’t you consider such option?

    • dmbarbour says:

      Yes, I considered Kyoto Cabinet. And LevelDB. And Rocks. And a number of other key-value databases. I remember spending a lot of time studying benchmarks, transaction models, etc. before settling on LMDB. At this time, it has been a few years and I cannot recall all the details that led to my final decision.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s