## Competitive eating

I've previously suggested parallel processing might have a touch of the infernal about it, and further evidence might be how it allows us to usefully indulge in one of the seven deadly sins, that of gluttony. Interestingly this link between concurrency and food was also explored in a famous problem in software design, that of the "Dining philosophers".  Here some hungry philosophers with insufficient forks must compete for utensils in such a way as to avoid deadlocks, race conditions and ultimately starvation.  This problem is a reminder of how tricky it can be to safely mediate concurrent access to resources.

Software developers have always had a responsibility to consider how system resources will be consumed.  Most operating systems aim for an equitable distribution via pre-emptive multitasking, but multi-processing architectures grant us the scope to create greedy routines consuming as much resource as quickly as possible.  In an environment with a heavy workload and lots of competing requests this is probably not the wisest move.  However with spare capacity and calculations that must complete in the shortest possible time, then (to paraphrase Gordon Gecko), greed is definitely good.

So, what in this context is greed? Well, the basic unit of processing in a modern operating system is a thread. A thread is a schedulable sequence of instructions, and it uses resources allocated to a containing process. When we run sequential programs, the operating system will typically create a process and thread to run the code on a single processor. If our thread uses the entire capacity of that processor, well, that is the limit of its throughput, and we'll just have to wait until completion. But if we architect our code to run across multiple threads or processes, suddenly we can consume more than 100% of a single processor, and finish in fraction of the time. The nature of the problem (and the quality of implementation) will typically determine how scalable our code can be across multiple processors, with an ideal of a near-linear reduction in runtime as we add threads.

In the early days we could think of a server in the context of a single physical processor, but those architectures were soon eclipsed by multiple processors sharing memory in an symmetric multi-processing (SMP) arrangement. More recently multi-core processors came on the scene, where each individual processor package contained its own SMP environment.  And many modern environments now go one step further, supporting simultaneous multi-threading via super-scalar CPU architectures where a single core can process more than one thread simultaneously. These single physical cores expose multiple virtual processors at the operating system level.  So, whilst we have to assimilate a number of levels of abstraction, for the developer who gets the coding right it can be an all-you-can-eat buffet.

Our most recent foray into the multi-processing diner saw us implementing a parallel version of our basis equating code, where we've seen a more than six-fold reduction in run-time across seven threads on a dedicated server. That time saving is returned directly to our clients, which has to be a good thing. And in the sometimes rigid discipline of actuarial modelling where parsimony and prudence are our watchwords, isn't it refreshing to know there is still a place for a little over-indulgence?

Assume we have a random variable, $$X$$, with expected value ... Read more