Shared-memory concurrent programming and verification traditionally assumes Sequential Consistency, but real-world hardware only provides relaxed consistency models. Furthermore, the consistency models of different hardware architectures vary widely and have often been poorly defined, while programming language models (aiming to abstract from hardware details) are different again. The IBM Power line of multiprocessors has long been recognized to have one of the most complex and subtle memory consistency models, and ARM multiprocessors have a broadly similar model. Together, they directly inspired features of the new C/C++ concurrency model in ISO C/C++11. However, it was an open question among the ISO concurrency committee whether the model even made sense, or could be implemented.
I will describe my work in devising mathematically precise abstract models for ARM and Power multiprocessors, which has shed light on hitherto unknown programmer-observable behaviour. The model was developed based on extensive experiments and on discussion with IBM and ARM architects; it is expressed as an abstract machine, abstracting as much as possible (but no more) from microarchitectural implementation details.
I will also describe proofs that show that a recommended implementation strategy for C/C++11 concurrency to Power is sound. All compilers will have to implement such a strategy, and furthermore, all have to agree on a common strategy for linking to be sound. This proof builds confidence in both models, and also provides insight into how the two very different models really work.