Progress Report: Loops and Restarts in Sirea

I’ve been working on Sirea quite a bit in the last few months, albeit making less progress than I had hoped. Mostly, I’ve been running in circles.

In Reactive Demand Programming, all loops are indirect, via a shared external resource (e.g. shared state or a demand monitor). A simple behavior that might loop is: monitor >>> bfmap maxPlusOne >>> bdelay 0.1 >>> demand, using a demand-monitor resource. The delay of 0.1 seconds means the loop logically runs at 10Hz. (A loop without delay is potentially divergent and usually a bug.) In Sirea, a loop must be broken across steps in a thread. This means the thread hosting this behavior must physically be running at an average rate of at least 10 steps each second, though the physical frequency may be decoupled from the logical frequency (e.g. running in bursts).

At the time of my progress report, I had loops working robustly, but not efficiently. I achieved this by making a conservative estimate: if a resource could possibly be in a loop, I treated it as though it were in a loop. I.e. `demand` always broke updates across steps, even if there was no local cycle. Breaking updates across steps increases update latency and risk of rework downstream (non-optimal update orderings, more visible intermediate values). Ideally, I’d want to do as much work in each step as possible.

In what may be a fine example of foolishly premature optimization, I’ve over the last few months been modifying Sirea to support thread-local cycle detection and making a few other tweaks. This required a big, painful, sweeping overhaul.

Partition-local cycle detection is now achieved dynamically at each step: each resource that could be part of a cycle sends out a pulse for cycle detection. If it hears its own pulse then it is part of a cycle. Each cycle only needs to be ‘cut’ at one point – the first to detect it – which allows pipelines within a greater cycle. This works alongside the `touch` mechanism, which warns of impending updates to achieve a near optimal update ordering within each partition thread. (Cross-partition cycles are implicitly broken across steps, just like everything else that crosses partitions.)

An issue I found while doing this work regarded what happens when the process is frozen by an external source, then restored. I work with a laptop that suspends automatically when I close the lid, so I experienced a phenomenon where – upon opening the lid – I’d get a sudden burst of CPU for several seconds while the system tries to ‘catch up’ with real-time. I had modeled this by basically sending a signal that says: “oops! retroactively correct the last few hours; we were inactive at the time!”. (It does this if it detects a jump more than a few seconds.) But this correction does not fix stability within the loop. The problem: a loop with delay 100ms can only increase its logical stability 100ms per full cycle. After being suspended for an hour, I’d need to perform 36k full cycles for stability to catch up. That’s a lot of processing to happen all at once, though it wouldn’t have been a problem if it was distributed over the hour.

I’m currently considering ways to either mitigate this issue or fully correct for it. One way to mitigate the problem: increase stability by a small fraction (e.g. 0.25%) of the difference between stability and update, when the difference is greater than a few seconds. This would reduce the number of steps to a logarithmic function of time. A potential way to correct the problem: make the resources a bit smarter for this particular issue, able to see when there is a huge period of inactivity.

Anyhow, that’s what I’ve been working on for Sirea. Hopefully I’ll get to more interesting stuff (e.g. sound, UI) soon.

This entry was posted in Modularity, Reactive Demand Programming, Stability. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s