ccc

Mechanically Checked Safety Proof of a Byzantine Paxos Algorithm

Leslie Lamport

Last modified on 25 July 2020

This web page links to the formal PlusCal specifications of the algorithms, and the machine-checked TLA+ proofs of their correctness, that are presented informally in the paper Byzantizing Paxos by Refinement. That paper describes an algorithm BPCon that abstracts and generalizes the consensus algorithm at the heart of the algorithm described in the seminal paper Practical Byzantine Fault Tolerance and Proactive Recovery by Miguel Castro and Barbara Liskov (ACM Transactions on Computer Systems 20, 4 (2002), pages 398-461). BPCon also admits two other implementations: a simple one that employs digital signatures and a previously unpublished one that, like the Castro-Liskov algorithm, relies only on message authenticators. It is derived by Byzantizing algorithm PCon, a variant of the classic Paxos consensus algorithm first published in The Part-Time Parliament by Leslie Lamport, (ACM Transactions on Computer Systems 16, 2 (1998), pages 133-169). (Paxos also appeared without proof in Brian Oki's Ph.D. thesis.) Although never presented in this form, PCon was essentially used as part of two previous algorithms: Vertical Paxos and an unpublished (but patented) version of Cheap Paxos.

What We've Done [show]

The Specifications [show]

Verifying the Refinements [show]

Each algorithm refines (implements) the previous one under a suitable refinement mapping. Before attempting any proofs, we checked the safety part of each refinement except the last with the TLC model checker, using large enough models that we felt confident that there were no coding errors. Our understanding of these algorithms is good enough that we were confident that they were basically correct before writing any proofs. We also model checked BPCon to find simple errors, but TLC cannot check it on large enough models to give any confidence in correctness of the algorithm.

The final refinement step, that BPCon refines PCon, has been proved and the proof completely checked by TLAPS, except that a few simple mathematical facts about finite sets are assumed without proof--for example, that a finite, non-empty set of integers contains a maximal element. These assumptions were checked with TLC to avoid possible errors in their statements. Theorems in library modules developed since the proofs were written would now make it easy to prove those statements.

We have written a proof of the first refinement step, that Voting refines Consensus, including the liveness properties. All safety properties were completely checked, except that we assumed a few of the same kind of simple mathematical facts as in the refinement proof of BPCon, where possible checking them with TLC. It would be nice to complete the entire refinement sequence by proving that PCon refines the safety part of the Voting spec. However, there is little motivation to do that because: (i) we already believe that PCon is correct, (ii) its proof would be less of a test of our ability to write and machine check such proofs than the refinement proof of BPCon, and (iii) a proof of correctness of ordinary Paxos has already been written and checked with TLAPS.

The proof that the Voting spec implements the liveness property of Consensus has not been completely checked, since TLAPS cannot yet reason about TLA's ENABLED operator or do non-propositional temporal logic reasoning.

This page can be found by searching the Web for the 23-letter string uidlamportbyzpaxosproof. Please do not put this string in any document that could wind up on the web--including email messages and Postscript and Word documents. You can refer to it in Web documents as the string obtained by removing the - from uid-lamportbyzpaxosproof.