Purely Functional Performance

An argument I’ve encountered umpteen times, most recently on LtU, is that imperative programming performs better than functional programming because it has O(1) updates, while updating functional structures is typically O(lg(N)). I’m now weary of repeating the counterpoints so, like a good programmer, I’ve decided to refactor them into a blog post for future linking. Here goes:

In any physical system, memory is laid out in a 3-dimensional physical space. Consequently, for a memory of N locations, speed-of-light delays between memory and any centralized processing unit grow at least in proportion to N^(1/3). Further, there is a lg(N) factor if we give every location in the space a unique address. Thus, if we account for known physical constraints, the very best we can theoretically do for centralized processing of a data structure is O(lg(N) * N^(1/3)). (Or maybe that `*` should be a `+`.) In practice, memory tends to be laid out in two dimensions (surface of a board, surface of the Earth) so the factor is actually N^(1/2).
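To make the geometric part of that bound concrete, here is a rough sketch. It assumes memory is packed at uniform density around a central processor and that a signal must make a round trip to the farthest cell at the speed of light. The function names and the cell sizes are illustrative assumptions, not measurements of any real hardware.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, in metres per second


def min_round_trip_3d(n_cells, cell_volume_m3):
    """Lower bound on round-trip time to the farthest cell when n_cells
    are packed into a sphere around the processor (assumed layout)."""
    radius = (3.0 * n_cells * cell_volume_m3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return 2.0 * radius / C


def min_round_trip_2d(n_cells, cell_area_m2):
    """Same bound when the cells are spread over a flat disk instead."""
    radius = math.sqrt(n_cells * cell_area_m2 / math.pi)
    return 2.0 * radius / C


# Hypothetical cell sizes: a 10 nm cube and a 10 nm square.
CELL_VOLUME = 1e-24
CELL_AREA = 1e-16

n = 2 ** 40
ratio_3d = min_round_trip_3d(8 * n, CELL_VOLUME) / min_round_trip_3d(n, CELL_VOLUME)
ratio_2d = min_round_trip_2d(4 * n, CELL_AREA) / min_round_trip_2d(n, CELL_AREA)
```

Under these assumptions, multiplying the memory size by 8 doubles the 3-dimensional bound, and multiplying it by 4 doubles the 2-dimensional one, which is the N^(1/3) and N^(1/2) scaling described above. The lg(N) addressing term is not modelled here.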

Usually, we ignore these physical factors. However, they’re more difficult to ignore today than they were historically. Today, we have three levels of cache, NUMA, distributed shared memory models, using memory itself as cache for the disk, and so on.

If we do account for these physical factors, then functional data structures have the same asymptotic properties as imperative ones.

Conversely, functional structures can easily model physical memory, e.g. with a trie where every key is a 64-bit number. In this case, we can guarantee that access and update are O(1) in the same asymptotic sense as is the case for physical RAM. Any update to the trie requires creating at most 64 new nodes (one per bit of the key, if the trie branches on one bit per level).
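A minimal sketch of such a trie, assuming it branches on one bit per level and that keys are integers in the range 0 to 2**64 - 1. The names `insert` and `lookup` and the tuple-based node layout are illustrative choices, not a reference implementation. Nodes are immutable pairs, and the slots of the deepest node hold the values directly, so an update builds exactly one new node per key bit and shares everything else with the previous version.

```python
KEY_BITS = 64  # keys are assumed to be integers in [0, 2**64)


def insert(node, key, value, shift=KEY_BITS - 1):
    """Return the root of a new trie that maps key to value.

    The trie rooted at `node` is never modified. Each call builds one
    new (zero_child, one_child) pair, and the recursion is KEY_BITS deep.
    """
    zero, one = node if node is not None else (None, None)
    bit = (key >> shift) & 1
    child = one if bit else zero
    # At the last level the slot holds the value itself.
    new_child = value if shift == 0 else insert(child, key, value, shift - 1)
    return (zero, new_child) if bit else (new_child, one)


def lookup(node, key, default=None):
    """Follow the key's bits from most to least significant."""
    for shift in range(KEY_BITS - 1, -1, -1):
        if node is None:
            return default
        node = node[(key >> shift) & 1]
    return default if node is None else node


old = insert(None, 5, "a")
new = insert(old, 2 ** 63 + 1, "b")
```

Here `old` still maps only key 5, while `new` maps both keys and reuses the whole subtree that `old` built for key 5. In this sketch a stored value of `None` is indistinguishable from an absent key.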

So, no matter which perspective you take – physical or logical – functional code isn’t asymptotically worse than imperative code.

However, FP does tend to be less efficient in every relevant absolute sense. Allocating 64 nodes is a lot more expensive than in-place mutation of RAM in a 64-bit address space. So, I do acknowledge that performance is a concern.

Performance can be divided into a few different concerns, such as efficiency, utilization, and scalability. Efficiency: do we use our resources well? Utilization: do we use all of our resources? Scalability: can we easily throw more resources at the problem? (Or gracefully take some away?)

While FP isn’t doing so well with absolute efficiency, it can achieve good utilization and perhaps has greater potential for scalability (due to ease of parallelization and location-independence of computations).

Further, due to structure sharing, FP performance tends to be more stable under external scenarios that might benefit from acquiring atomic snapshots of machine state: orthogonal persistence, time travel debugging, state visualization, live programming, rollbacks for exception-safe consistency, parallel reads and mirrored servers, etc.
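As a small illustration of why snapshots are cheap under structure sharing, consider a persistent stack built from immutable pairs. This is only a sketch: `push`, `to_list`, and the event names are hypothetical. Keeping a snapshot is just keeping a reference to an old version, and each new version costs one extra cell.

```python
def push(stack, item):
    """Return a new stack; the old one stays valid and unchanged."""
    return (item, stack)


def to_list(stack):
    """Collect the items of a stack, newest first."""
    out = []
    while stack is not None:
        item, stack = stack
        out.append(item)
    return out


state = None       # the empty stack
history = [state]  # every snapshot is just a reference
for event in ["open", "edit", "save"]:
    state = push(state, event)
    history.append(state)

# "Rolling back" means picking up an earlier reference again.
rolled_back = history[2]
```

Each entry in `history` shares its tail with the entries before it, so recording all versions takes space proportional to the number of updates rather than to the total size of every version. A mutable structure would need a full copy, or a log of undo operations, to offer the same thing.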

So, while performance certainly is a concern, I believe FP fares very well – both asymptotically and absolutely – after we step back and look at the big picture.

This entry was posted in Language Design.
