|
ToleRace

Race conditions are memory errors that occur when multiple threads read and write a memory location in an under-specified order. Because race conditions depend on the interleaving of the memory operations of individual threads, they are notoriously difficult to reproduce and represent a major obstacle in the task of writing correct concurrent programs. Current approaches dealing with race conditions focus on the problem of race detection and face three significant obstacles:
ToleRace is the first system to tolerate races in a computing environment. Instead of solving the general problem, which in many ways can be considered ill-defined due to semantic underspecification, we focus on the case in which a programmer correctly writes a critical section in her thread. We want to enable a program execution mode in which this critical section never observes a race, i.e., it tolerates accesses to shared variables made by other, erroneous threads. We achieve this objective using a simple data replication procedure and a race-detection test at the exit from a correctly specified critical section, followed by an error correction stage. We do not alter, instrument, or change in any way the potentially "malicious" instructions in the remainder of the executable.

Depending on the level of hardware support, ToleRace can either tolerate many asymmetric races (checkmarked in the Table below) and detect the remaining ones (with support for atomic copying), or even enable race-free execution of correctly written critical sections (with more complex but still simple hardware support). All claims related to the ToleRace oracle are specified in [1].

To the best of our knowledge, our work is the first to enumerate all possible interleavings that cause races in a bi-threaded environment (as per the Table below), define the objective of asymmetric race-free execution, and provide software and hardware support for achieving this objective (the hardware support is only outlined so far; we are currently working on a specific hardware design). The benefits of our system with respect to transactional memory [2] are substantial: there is no need to rewrite legacy software, and the software-hardware support is significantly simpler. The latest experimental results for a software implementation of ToleRace are documented in [3].
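The replication step and the exit test described above can be illustrated with a small, deterministic simulation. This is only a sketch under stated assumptions, not the ToleRace implementation: the names `Shared` and `ToleratedSection` are hypothetical, the erroneous thread is simulated by a plain assignment inside a single thread, and the correction policy (the critical section's result overwrites the stray write) is one assumed choice among the several that the interleaving could call for.

```python
from dataclasses import dataclass


@dataclass
class Shared:
    """A shared memory location that any thread may read or write."""
    value: int


class ToleratedSection:
    """Hypothetical helper: runs a critical section on a private copy."""

    def __init__(self, shared: Shared):
        self.shared = shared
        self.race_detected = False

    def __enter__(self):
        # Replication: a working copy for the critical section and an
        # untouched reference copy for the test performed at exit.
        self.reference = self.shared.value
        self.local = self.shared.value
        return self

    def __exit__(self, exc_type, exc, tb):
        # Race-detection test: a mismatch means some other code wrote the
        # shared location while the critical section was running.
        self.race_detected = self.shared.value != self.reference
        # Assumed correction policy: publish the section's own result,
        # overwriting the stray write. Nothing is published on an exception.
        if exc_type is None:
            self.shared.value = self.local
        return False


counter = Shared(10)
with ToleratedSection(counter) as cs:
    cs.local += 1        # correct code works only on its private copy
    counter.value = 99   # simulated write by an erroneous thread
    cs.local += 1        # the section never observes the value 99

print(counter.value, cs.race_detected)  # prints "12 True"
```

In a real multithreaded setting the copies made at entry and the compare-and-publish at exit would themselves have to be atomic with respect to other threads, which is where the hardware support for atomic copying mentioned above comes in; the sketch leaves that out.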
|