How to Build a Highly Available System Using Consensus

  • Butler Lampson

10th International Workshop on Distributed Algorithms (WDAG 1996)

Published by Springer

Editor(s): Ozalp Babaoglu and Keith Marzullo

The proceedings are: Distributed Algorithms, Lecture Notes in Computer Science 1151, Springer, 1996.


Lamport showed that a replicated deterministic state machine is a general way to implement a highly available system, given a consensus algorithm that the replicas can use to agree on each input. His Paxos algorithm is the most fault-tolerant way to get consensus without real-time guarantees. Because general consensus is expensive, practical systems reserve it for emergencies and use leases (locks that time out) for most of the computing. This paper explains the general scheme for efficient highly available computing, gives a general method for understanding concurrent and fault-tolerant programs, and derives the Paxos algorithm as an example of the method.
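The two ideas named in the abstract, a replicated deterministic state machine and a lease, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names Replica, Lease, and agreed_log are hypothetical, the state machine is a toy key-value store, and the consensus step is not implemented (the agreed input sequence is simply written out by hand, where a real system would choose each entry with an algorithm such as Paxos).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Replica:
    """A deterministic state machine (hypothetical toy key-value store)."""
    state: dict = field(default_factory=dict)
    applied: int = 0  # number of agreed inputs applied so far

    def apply(self, command):
        # The transition depends only on the current state and the input,
        # so replicas that see the same inputs in the same order agree.
        op, key, value = command
        if op == "set":
            self.state[key] = value
        elif op == "delete":
            self.state.pop(key, None)
        self.applied += 1


@dataclass
class Lease:
    """A lock that times out. The caller passes in the current time."""
    holder: Optional[str] = None
    expires_at: float = 0.0

    def acquire(self, who, now, duration):
        # Grant if the lease is free, has expired, or is being renewed.
        if self.holder is None or now >= self.expires_at or self.holder == who:
            self.holder = who
            self.expires_at = now + duration
            return True
        return False

    def held_by(self, who, now):
        return self.holder == who and now < self.expires_at


# Assumption: this sequence stands in for the output of consensus.
agreed_log = [("set", "x", 1), ("set", "y", 2), ("delete", "x", None)]

replicas = [Replica() for _ in range(3)]
for replica in replicas:
    for command in agreed_log:
        replica.apply(command)

lease = Lease()
first = lease.acquire("a", now=0.0, duration=10.0)   # granted: lease is free
second = lease.acquire("b", now=5.0, duration=10.0)  # refused: "a" still holds it
third = lease.acquire("b", now=10.0, duration=10.0)  # granted: "a" has timed out
```

Passing the time in as an argument keeps the sketch deterministic. A real lease implementation would also have to allow for clock skew between the holder and the grantor, which this sketch ignores.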