Without a Toolkit--Slides

Butler Lampson

Microsoft

**Abstract**: In this talk I will describe the general
recipe for building a highly available system and show how to do it efficiently
in practice. The idea is to build a replicated state machine. The tricky part is getting all the replicas
to agree on the content and order of the inputs. For this you need a general fault-tolerant
algorithm for reaching consensus. Two-phase commit uses the simplest possible
such algorithm, but I will describe
Lamport's "Paxos" algorithm, which is the best one that works without
real-time guarantees. Since consensus is
expensive, you also need to minimize its use, and you do that with leases,
which are locks that time out.

The other theme of the talk is a general method for designing or understanding a complex system, especially a concurrent or fault-tolerant one. Using the Paxos algorithm as the example, I will show you how to write a precise spec for such a system and how to prove that an implementation satisfies the spec. There won't be any formulas in the talk.

Word, Acrobat, HTML.
A paper is here.