A Scalable Joins Library

Scalable Joins is a CLR 4.0 library for declarative, scalable parallel synchronization.

Background

With multiple cores, even mainstream programmers need to write parallel code.  Those parallel threads need to communicate and synchronize.

Writing correct synchronization code is hard; achieving performance that scales with the number of cores is a black art.

Scalable Joins is a library that extends C# and other Common Language Runtime languages with declarative, high-level synchronization constructs.

The model is based on the message-passing join calculus:

· Objects can declare channels on which to receive messages and requests;

· Methods are declared to react when sets of channels are filled.

Threads coordinate by sending messages. The ideas are simple, yet powerful.
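To make the model concrete, here is a hand-rolled illustration in plain C# (not the Joins library itself) of the simplest possible join: a buffer whose synchronous Get fires only when an asynchronous Put message is present. The Joins library generates this kind of coordination automatically from declarative patterns; the class below is just a sketch of the semantics using an ordinary lock.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Illustration only: a hand-coded two-channel join (one synchronous
// request channel, one asynchronous message channel) built from a lock.
// With the Joins library, When(Get).And(Put).Do(...) replaces all of this.
class TinyJoin
{
    private readonly object gate = new object();
    private readonly Queue<string> messages = new Queue<string>();

    // Asynchronous channel: deposit a message and return immediately.
    public void Put(string s)
    {
        lock (gate) { messages.Enqueue(s); Monitor.Pulse(gate); }
    }

    // Synchronous channel: blocks until the "pattern" is complete,
    // i.e. until both a Get request and a Put message are present.
    public string Get()
    {
        lock (gate)
        {
            while (messages.Count == 0) Monitor.Wait(gate);
            return messages.Dequeue();
        }
    }
}

class Demo
{
    static void Main()
    {
        var buf = new TinyJoin();
        var producer = new Thread(() => buf.Put("hello"));
        producer.Start();
        Console.WriteLine(buf.Get()); // prints "hello"
        producer.Join();
    }
}
```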

The library is a new, “lock-free” re-implementation of the lock-based Joins concurrency library, designed to scale on parallel hardware. As before, programmers write simple, high-level code, but now they also get the parallel scalability of complex, low-level code previously crafted by experts.

Example

Consider E. Dijkstra's classic Dining Philosophers problem. Declarative join patterns, provided by our Joins library, make it almost trivial to solve; all we have to do is state the desired constraints:

var table = Join.Create(2 * n); // n philosophers + n forks = 2*n channels
 
// channel arrays for requests and resources
Synchronous.Channel[] hungry; table.Initialize(out hungry, n);
Asynchronous.Channel[] forks; table.Initialize(out forks, n);
for (int i = 0; i < n; i++) { 
 var leftFork = forks[i]; var rightFork = forks[(i + 1) % n];
 // a join pattern
 table.When(hungry[i]).And(leftFork).And(rightFork).Do(() => {
    // eat ... 
    leftFork(); rightFork(); // replace forks 
    // think ...
 });
}
 
// set the table 
foreach (var fork in forks) fork();  
 
// spawn the philosophers
for (int i = 0; i < n; i++) { var _i = i; // capture loop variable by value
  var phil = new Thread(() => {   
    while (true) hungry[_i](); // request to eat
  });
  phil.Start();
}
 

This code represents the resources of the problem (the forks) as asynchronous channels. A resource is available if, and only if, there is a message on its channel. When hungry, philosopher i makes a synchronous request on his dedicated channel, hungry[i]. A request on hungry[i] blocks until both the philosopher's left and right forks are available. Once done eating, a philosopher releases his forks by invoking their channels, replenishing the resources he just consumed.
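One subtlety in the spawning loop above is the local copy `var _i = i`: in C#, a lambda captures the `for` loop variable itself, not its value at the current iteration, so without the copy every philosopher would read the final value of i. A small, self-contained illustration of the pitfall and the fix:

```csharp
using System;
using System.Collections.Generic;

class CaptureDemo
{
    static void Main()
    {
        // Pitfall: all three lambdas share the single loop variable i,
        // which is 3 once the loop has finished.
        var shared = new List<Func<int>>();
        for (int i = 0; i < 3; i++) shared.Add(() => i);
        foreach (var f in shared) Console.Write(f() + " "); // 3 3 3
        Console.WriteLine();

        // Fix: copy i into a fresh local per iteration, as the
        // philosopher-spawning loop does with `var _i = i`.
        var copied = new List<Func<int>>();
        for (int i = 0; i < 3; i++) { var _i = i; copied.Add(() => _i); }
        foreach (var f in copied) Console.Write(f() + " "); // 0 1 2
        Console.WriteLine();
    }
}
```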

Performance

On a 24-core machine, using a benchmark derived from the above code, the original lock-based solution doesn't scale beyond 10 cores, while our new lock-free implementation fares much better, competing with the absolute performance and scalability of a bespoke solution (Dijkstra's solution using fine-grained locks).

[Graph: throughput scaling of the lock-based, scalable, and fine-grained-lock solutions on up to 24 cores]

Here, each philosopher (thread) spins 50 iterations eating (with forks) and 10000 iterations thinking (without forks). In total, philosophers eat 100000 times.

Another benchmark, measuring absolute runtimes, has the philosophers doing no work eating or thinking, thus isolating the cost of synchronization itself:

[Graph: absolute runtimes of the synchronization-only benchmark as threads increase]

This graph demonstrates the increasing overhead of the old, lock-based implementation as we add more threads; the new, scalable implementation does much better.
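For reference, the fine-grained-lock baseline mentioned above can be sketched in plain C#. This is our own minimal rendering of Dijkstra's resource-ordering solution, not the benchmark code itself: each fork is a lock object, and deadlock is avoided by always acquiring the lower-numbered fork first. The counts (`n`, `meals`) are illustrative, not the benchmark's parameters.

```csharp
using System;
using System.Threading;

class Philosophers
{
    static void Main()
    {
        const int n = 5;        // philosophers (and forks)
        const int meals = 1000; // meals per philosopher
        var forks = new object[n];
        for (int i = 0; i < n; i++) forks[i] = new object();

        long totalMeals = 0;
        var threads = new Thread[n];
        for (int i = 0; i < n; i++)
        {
            int id = i; // per-iteration copy, as in the Joins example
            threads[i] = new Thread(() =>
            {
                // Dijkstra's trick: impose a global order on the two
                // forks so that no cycle of waiting threads can form.
                object first  = forks[Math.Min(id, (id + 1) % n)];
                object second = forks[Math.Max(id, (id + 1) % n)];
                for (int m = 0; m < meals; m++)
                {
                    lock (first) lock (second)
                    {
                        Interlocked.Increment(ref totalMeals); // eat
                    }
                    // think ...
                }
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        Console.WriteLine(totalMeals); // n * meals = 5000
    }
}
```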

Links

Microsoft internal users can access a tech-preview release of the library, complete with samples and documentation, at http://toolbox/scalablejoins.

A draft paper describing the implementation and a preliminary performance evaluation is publicly available here.

Credits

This is joint work with Aaron Turon from Northeastern University.
