The Writings of Leslie Lamport

My Writings

Leslie Lamport

Last modified 2 February 2008

This document contains descriptions of almost all my technical papers and electronic versions of many of them for downloading.  Omitted are papers for which I no longer have copies and papers that are incomplete.  I have also omitted early versions of some of these papers--even in cases where the title changed.  Included are some initial drafts of papers that I abandoned before fixing errors or other problems in them.  A table of contents precedes the descriptions.  I also include a brief curriculum vitae.  A printable version of this document is available as a postscript, gzipped postscript, or pdf file. 

Each description attempts to explain the genesis of the work.  However, I have forgotten how I came to write most of my papers.  Even when I discourse at length about the development of the ideas, I am giving only a subjective view based on my unreliable memory.  Whenever possible, I have asked other people involved to check the accuracy of what I've written.  However, what I have most often forgotten is the work of others that influenced my own work.  This may give the impression that I am claiming more credit for ideas than I deserve, for which I apologize. 

Where I think it's interesting, I give the story behind the publication or non-publication of a paper.  Some of the stories read like complaints of unfair treatment by editors or referees.  Such cases are bound to arise in any activity based on human judgment.  On the whole, I have had little trouble getting my papers published.  In fact, I have profited from the natural tendency of editors and referees to be less critical of the work of established scientists.  But I think it's worth mentioning the cases where the system didn't work as it should. 

I would like to have ordered my papers by the date they were written.  However, I usually have no record of when I actually wrote something.  So, I have ordered them by date of publication or, for unpublished works, by the date of the version that I have.  Because of long and variable publication delays, this means that the order is only approximately chronological. 

Whenever possible, I have included electronic versions of the works.  At the moment, I have electronic versions mainly of works written after about 1985.  For journal articles, these may be "preprint" versions, formatted differently and sometimes differing slightly from the published versions.  I hope in the future to provide scanned versions of earlier publications. 


Education

Employment

Honors


This page can be found by searching the Web for the 23-letter string alllamportspubsontheweb.  Please do not put this string in any document that might appear on the Web-including email messages and Postscript, PDF, and Word documents.  One way to refer to it in Web documents is "the string obtained by removing the - characters from the string alllam-portspu-bsonth-eweb."


Contents

  1. Summer Vision Programs
  2. Preliminary User's Guide to Monitor 1
  3. Untitled Draft of Advanced Calculus Text
  4. The Geometry of Space and Time
  5. Comment on Bell's Quadratic Quotient Algorithm
  6. The Analytic Cauchy Problem with Singular Data
  7. An Extension of a Theorem of Hamada on the Cauchy Problem with Singular Data
  8. The Coordinate Method for the Parallel Execution of DO Loops
  9. The Parallel Execution of DO Loops
  10. The Hyperplane Method for an Array Computer
  11. A New Solution of Dijkstra's Concurrent Programming Problem
  12. On Self-stabilizing Systems
  13. On Programming Parallel Computers
  14. Parallel Execution on Array and Vector Computers
  15. Multiple Byte Processing with Full-Word Instructions
  16. The Synchronization of Independent Processes
  17. Comments on `A Synchronization Anomaly'
  18. Garbage Collection with Multiple Processes: an Exercise in Parallelism
  19. The Coordinate Method for the Parallel Execution of Iterative Loops
  20. Towards a Theory of Correctness for Multi-User Data Base Systems
  21. On the Glitch Phenomenon
  22. Proving the Correctness of Multiprocess Programs
  23. Formal Correctness Proofs for Multiprocess Algorithms
  24. On Concurrent Reading and Writing
  25. State the Problem Before Describing the Solution
  26. Time, Clocks and the Ordering of Events in a Distributed System
  27. The Specification and Proof of Correctness of Interactive Programs
  28. The Implementation of Reliable Distributed Multiprocess Systems
  29. SIFT: Design and Analysis of a Fault-Tolerant Computer for Aircraft Control
  30. On-the-fly Garbage Collection: an Exercise in Cooperation
  31. A General Construction for Expressing Repetition
  32. A New Approach to Proving the Correctness of Multiprocess Programs
  33. How to Present a Paper
  34. How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs
  35. Constructing Digital Signatures from a One Way Function
  36. On the Proof of Correctness of a Calendar Program
  37. Letter to the Editor
  38. `Sometime' is Sometimes `Not Never'
  39. The `Hoare Logic' of Concurrent Programs
  40. Reaching Agreement in the Presence of Faults
  41. Program Verification: An Approach to Reliable Hardware and Software
  42. Password Authentication with Insecure Communication
  43. TIMESETS--A New Method for Temporal Reasoning About Programs
  44. Byzantine Generals and Transaction Commit Protocols
  45. The Byzantine Generals Problem
  46. Proving Liveness Properties of Concurrent Programs
  47. An Assertional Correctness Proof of a Distributed Program
  48. Reasoning About Nonatomic Operations
  49. Specifying Concurrent Program Modules
  50. Specification and Proof of a Fault-Tolerant Real-Time Algorithm
  51. The Weak Byzantine Generals Problem
  52. PHIL: A Semantic Structural Graphical Editor
  53. What Good Is Temporal Logic?
  54. Using Time Instead of Timeout for Fault-Tolerant Distributed Systems
  55. The Hoare Logic Of CSP, and All That
  56. Byzantine Clock Synchronization
  57. Solved Problems, Unsolved Problems and NonProblems in Concurrency
  58. On a "Theorem" of Peterson
  59. Buridan's Principle
  60. The Mutual Exclusion Problem--Part I: A Theory of Interprocess Communication, Part II: Statement and Solutions
  61. Synchronizing Clocks in the Presence of Faults
  62. What It Means for a Concurrent Program to Satisfy a Specification: Why No One Has Specified Priority
  63. Constraints: A Uniform Approach to Aliasing and Typing
  64. Recursive Compiling and Programming Environments (Summary)
  65. Distributed Snapshots: Determining Global States of a Distributed System
  66. Formal Foundation for Specification and Verification
  67. An Axiomatic Semantics of Concurrent Programming Languages
  68. LaTeX: A Document Preparation System
  69. On Interprocess Communication--Part I: Basic Formalism, Part II: Algorithms
  70. The Byzantine Generals
  71. A Formal Basis for the Specification of Concurrent Systems
  72. A Fast Mutual Exclusion Algorithm
  73. Derivation of a Simple Synchronization Algorithm
  74. Distribution
  75. Document Production: Visual or Logical?
  76. Synchronizing Time Servers
  77. Control Predicates Are Better than Dummy Variables for Representing Program Control
  78. "EWD 1013"
  79. Another Position Paper on Fairness
  80. A Lattice-Structured Proof of a Minimum Spanning Tree Algorithm
  81. A Simple Approach to Specifying Concurrent Systems
  82. Pretending Atomicity
  83. Realizable and Unrealizable Specifications of Reactive Systems
  84. A Temporal Logic of Actions
  85. win and sin: Predicate Transformers for Concurrency
  86. A Theorem on Atomicity in Distributed Algorithms
  87. Distributed Computing: Models and Methods
  88. A Completeness Theorem for TLA
  89. The Concurrent Reading and Writing of Clocks
  90. The Mutual Exclusion Problem Has Been Solved
  91. The Existence of Refinement Mappings
  92. Preserving Liveness: Comments on `Safety and Liveness from a Methodological Point of View'
  93. Critique of the Lake Arrowhead Three
  94. The Reduction Theorem
  95. Mechanical Verification of Concurrent Systems with TLA
  96. Composing Specifications
  97. Verification of a Multiplier: 64 Bits and Beyond
  98. Verification and Specification of Concurrent Programs
  99. Hybrid Systems in TLA+
  100. How to Write a Proof
  101. The Temporal Logic of Actions
  102. Decomposing Specifications of Concurrent Systems
  103. Open Systems in TLA
  104. TLZ (Abstract)
  105. An Old-Fashioned Recipe for Real Time
  106. Specifying and Verifying Fault-Tolerant Systems
  107. How to Write a Long Formula
  108. Introduction to TLA
  109. Adding "Process Algebra" to TLA
  110. What Process Algebra Proofs Use Instead of Invariance
  111. Conjoining Specifications
  112. TLA in Pictures
  113. The RPC-Memory Specification Problem: Problem Statement
  114. A TLA Solution to the RPC-Memory Specification Problem
  115. How to Tell a Program from an Automobile
  116. Refinement in State-Based Formalisms
  117. Marching to Many Distant Drummers
  118. Processes are in the Eye of the Beholder
  119. How to Make a Correct Multiprocess Program Execute Correctly on a Multiprocessor
  120. Substitution: Syntactic versus Semantic
  121. The Part-Time Parliament
  122. Reduction in TLA
  123. Composition: A Way to Make Proofs Harder
  124. Proving Possibility Properties
  125. A Lazy Caching Proof in TLA
  126. Specifying Concurrent Systems with TLA+
  127. TLA+ Verification of Cache-Coherence Protocols
  128. Should Your Specification Language Be Typed?
  129. Model Checking TLA+ Specifications
  130. How (La)TeX changed the face of Mathematics
  131. Fairness and Hyperfairness
  132. Archival References to Web Pages
  133. Disk Paxos
  134. Disk Paxos (Conference Version)
  135. When Does a Correct Mutual Exclusion Algorithm Guarantee Mutual Exclusion
  136. Lower Bounds on Consensus
  137. The Wildfire Challenge Problem
  138. Paxos Made Simple
  139. Specifying and Verifying Systems with TLA+
  140. Arbiter-Free Synchronization
  141. A Discussion With Leslie Lamport
  142. Lower Bounds for Asynchronous Consensus
  143. Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers
  144. Checking Cache-Coherence Protocols with TLA+
  145. High-Level Specifications: Lessons from Industry
  146. The Future of Computing: Logic or Biology
  147. Consensus on Transaction Commit
  148. On Hair Color in France
  149. Formal Specification of a Web Services Protocol
  150. Cheap Paxos
  151. Implementing and Combining Specifications
  152. Lower Bounds for Asynchronous Consensus
  153. Generalized Consensus and Paxos
  154. Real Time is Really Simple
  155. How Fast Can Eventual Synchrony Lead to Consensus?
  156. Real-Time Model Checking is Really Simple
  157. Fast Paxos
  158. Measuring Celebrity
  159. Checking a Multithreaded Algorithm with +CAL
  160. The +CAL Algorithm Language
  161. TLA+
  162. Implementing Dataflow With Threads
  163. Leslie Lamport: The Specification Language TLA+
  164. Computation and State Machines


The Papers

  1. Summer Vision Programs
    Massachusetts Institute of Technology, Project MAC Memorandum MAC-M-332, Artificial Intelligence Project Memo Number Vision 111 (October 1966).
    PDF
    In the summer of 1966, I worked at the M.I.T. Artificial Intelligence Laboratory, doing Lisp programming for a computer vision project.  I have no memory of this document, but it appears to describe the programs I wrote that summer.  It's of no technical interest, but it does show that, even in those days, I was writing precise documentation. 
     

  2. Preliminary User's Guide to Monitor 1  (with Roland Silver)
    Mitre Technical Report (December 1966).
    No electronic version available.
    While in graduate school, I worked summers and part-time at the Mitre Corporation from 1962 to 1965.  I think I wrote three or four technical reports there, but this is the only one the Mitre library seems to have.  A large part of my time at Mitre was spent working on the operating system for a computer being built there called Phoenix.  This is the operating system's manual, apparently written by Silver based on work we had both done.  There is nothing of technical interest here, but it provides a snapshot of what was going on in the world of computers in the early 60s. 
     

  3. Untitled Draft of Advanced Calculus Text
    Unpublished (circa 1967).
    No electronic version available.
    During the 1965-1969 academic years, I taught math at Marlboro College.  I don't remember exactly when or how the project got started, but I wrote the first draft of an advanced calculus textbook for Prentice-Hall, from whom I received an advance of 500 dollars.  (That sum, which seems ridiculously small now, was a significant fraction of my salary at the time.)  The Prentice-Hall reviewers liked the draft.  I remember one reviewer commenting that the chapter on exterior algebra gave him, for the first time, an intuitive understanding of the topic.  However, because of a letter that was apparently lost in the mail, the Prentice-Hall editor and I both thought that the other had lost interest in the project.  By the time this misunderstanding had been cleared up, I was ready to move on to other things and didn't feel like writing a final draft.  (This was before the days of computer text processing, so writing a new draft meant completely retyping hundreds of pages of manuscript.) 
     

  4. The Geometry of Space and Time
    Unpublished (circa 1968).
    No electronic version available.
    Marlboro College, where I taught math from 1965-1969, had a weekly series of lectures for the general public, each given by a faculty member or an outside speaker invited by a faculty member.  I gave a lecture about relativity that I later turned into this short monograph.  I made a half-hearted, unsuccessful effort to get it published.  But it was too short (75 pages) to be a "real" book, and there was very little interest in science among the general public in the late sixties.  I think this monograph is still a very good exposition of the subject.  Unfortunately, the second half, on general relativity, is obsolete because it says nothing about black holes.  While black holes appear in the earliest mathematical solutions to the equations of general relativity, it was only in the late 60s that many physicists began seriously to consider that they might exist and to study their properties. 
     

  5. Comment on Bell's Quadratic Quotient Algorithm
    Communications of the ACM 13, 9   (September 1970).
    PDF
    Copyright © 1970 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    This short note describes a minor inefficiency I noticed in a hash-table algorithm published by James Bell.  It got me thinking about hash tables, and I invented what I called the linear quotient algorithm--an algorithm that seems quite obvious in retrospect.  While I was running simulations to gather data for a paper on that algorithm, the latest issue of CACM arrived with a paper by Bell and Charles Kaman titled The Linear Quotient Hash Code.  I had devised three variants of the algorithm not contained in their article.  So, I wrote a paper about those three variants and submitted it to CACM.  The editor rejected it, without sending it out for review, saying that it was too small a contribution to merit publication.  In the next few years, CACM published two papers by others on the subject, each completely devoted to one of my three variants.  (Those papers had been submitted to different editors.)  My paper, which was probably a Massachusetts Computer Associates (Compass) technical report, has been lost.  (Compass went out of business a few years ago, and I presume that its library was destroyed.)  The linear quotient method is probably the most common hash-coding algorithm used today. 
     

  6. The Analytic Cauchy Problem with Singular Data
    Ph.D. Thesis, Brandeis University (1972). Available from UMI
    here, as number 7232105.
    No electronic version available.
    I left Marlboro College and went back to Brandeis in 1969 to complete my Ph.D.  At that time, I intended to study and write a thesis in mathematical physics.  However, I wound up doing a thesis in pure mathematics, on analytic partial differential equations.  I learned nothing about analytic partial differential equations except what was needed for my thesis research, and I have never looked at them since then.  The thesis itself was a small, solid piece of very classical math.  Had Cauchy arisen from the grave to read it, he would have found nothing unfamiliar in the mathematics. 
     

  7. An Extension of a Theorem of Hamada on the Cauchy Problem with Singular Data
    Bulletin of the Amer.  Math.  Society 79, 4 (July 1973), 776-780.
    No electronic version available.
    At the time, and perhaps still today, a math student "copyrighted" his (seldom her) thesis results by announcing them in a short note in the Bulletin of the AMS.  Normally, a complete paper would be published later.  But I never did that, since I left math for computer science after completing my thesis. 
     

  8. The Coordinate Method for the Parallel Execution of DO Loops
    Proceedings of the 1973 Sagamore Conference on Parallel Processing, T. Feng, ed., 1-12.
    No electronic version available.
    Compass (Massachusetts Computer Associates) had a contract to write the Fortran compiler for the Illiac-IV computer, an array computer with 64 processors that all operated in lock-step on a single instruction stream.  I developed the theory and associated algorithms for executing sequential DO loops in parallel on an array computer that were used by the compiler.  The theory is pretty straightforward.  The creativity lay in the proper mathematical formulation of the problem.  Today, it would be considered a pretty elementary piece of work.  But in those days, we were not as adept at applying math to programming problems.  Indeed, when I wrote up a complete description of my work for my colleagues at Compass, they seemed to treat it as a sacred text, requiring spiritual enlightenment to interpret the occult mysteries of linear algebra. 

    Anyway, this paper was my first pass at writing up the complete version of the theory for publication.  I strongly suspect that it has never been read.  No-one seems to have noticed that, because of a text-editing error, the description of the algorithm is missing the punch line that says what can be executed in parallel.  This paper is superseded by the unpublished [19]
     

  9. The Parallel Execution of DO Loops
    Communications of the ACM 17, 2   (February 1974), 83-93.
    PDF
    Copyright © 1974 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    This is the only journal paper to come out of the work mentioned in the description of [8].  It contains essentially the special case of the results in [8] for a single tight nesting of loops.  It was one of the early articles on the topic, and I believe it was cited fairly often. 
     

  10. The Hyperplane Method for an Array Computer
    Proceedings of the 1974 Sagamore Conference on Parallel Processing, T. Feng, ed., Springer Verlag, 1-12.
    No electronic version available.
    In the late 19th century, the Vanderbilts (a rich American family) owned 500 hectares of beautiful land in the central Adirondack Mountains of New York State that included Sagamore Lake, which is about a kilometer in length.  There, they built a rustic summer estate.  In the 70s, and probably still today, the place was completely isolated, away from any other signs of civilization.  The estate, with land and lake, was given to Syracuse University, which operated it as a conference center.  An annual conference on parallel processing was held there late in the summers of 1973 through 1975.  Sagamore was the most beautiful conference site I have ever seen.  The conference wasn't bad, with a few good people attending, though it wasn't first rate.  But I would have endured a conference on medieval theology for the opportunity to canoe on, swim in, and walk around the lake. 

    To justify my attendance at Sagamore, I always submitted a paper.  But once I discovered that it was not a first-rate conference, I did not submit first-rate papers.  However, I don't republish old material (except as necessary to avoid forcing people to read earlier papers), so this paper must have included some new results about the hyperplane method for parallelizing sequential loops.  However, I can't find a copy of the paper and don't remember what was in it. 
     

  11. A New Solution of Dijkstra's Concurrent Programming Problem
    Communications of the ACM 17, 8   (August 1974), 453-455.
    PDF
    Copyright © 1974 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    This paper describes the bakery algorithm for implementing mutual exclusion.  I have invented many concurrent algorithms.  I feel that I did not invent the bakery algorithm, I discovered it.  Like all shared-memory synchronization algorithms, the bakery algorithm requires that one process be able to read a word of memory while another process is writing it.  (Each memory location is written by only one process, so concurrent writing never occurs.)  Unlike any previous algorithm, and almost all subsequent algorithms, the bakery algorithm works regardless of what value is obtained by a read that overlaps a write.  If the write changes the value from 0 to 1, a concurrent read could obtain the value 7456 (assuming that 7456 is a value that could be in the memory location).  The algorithm still works.  I didn't try to devise an algorithm with this property.  I discovered that the bakery algorithm had this property after writing a proof of its correctness and noticing that the proof did not depend on what value is returned by a read that overlaps a write. 

    I don't know how many people realize how remarkable this algorithm is.  Perhaps the person who realized it better than anyone is Anatol Holt, a former colleague at Massachusetts Computer Associates.  When I showed him the algorithm and its proof and pointed out its amazing property, he was shocked.  He refused to believe it could be true.  He could find nothing wrong with my proof, but he was certain there must be a flaw.  He left that night determined to find it.  I don't know when he finally reconciled himself to the algorithm's correctness. 

    Several books have included emasculated versions of the algorithm in which reading and writing are atomic operations, and called those versions "the bakery algorithm".  I find that deplorable.  There's nothing wrong with publishing a simplified version, as long as it's called a simplified version. 

    What is significant about the bakery algorithm is that it implements mutual exclusion without relying on any lower-level mutual exclusion.  Assuming that reads and writes of a memory location are atomic actions, as previous mutual exclusion algorithms had done, is tantamount to assuming mutually exclusive access to the location.  So a mutual exclusion algorithm that assumes atomics reads and writes is assuming lower-level mutual exclusion.  Such an algorithm cannot really be said to solve the mutual exclusion problem.  Before the bakery algorithm, people believed that the mutual exclusion problem was unsolvable--that you could implement mutual exclusion only by using lower-level mutual exclusion.  Brinch Hansen said exactly this in a 1972 paper.  Many people apparently still believe it.  (See [90].) 

    The paper itself does not state that it is a "true" mutual exclusion algorithm.  This suggests that I didn't realize the full significance of the algorithm until later.  But I don't remember. 

    For a couple of years after my discovery of the bakery algorithm, everything I learned about concurrency came from studying it.  Papers like [24], [32], and [69] were direct results of that study.  The bakery algorithm was also where I introduced the idea of variables belonging to a process--that is, variables that could be read by multiple processes, but written by only a single process.  I was aware from the beginning that such algorithms had simple distributed implementations, where the variable resides at the owning process, and other processes read it by sending messages to the owner.  Thus, the bakery algorithm marked the beginning of my study of distributed algorithms. 

    The paper contains one small but significant error.  In a footnote, it claims that we can consider reads and writes of a single bit to be atomic.  It argues that a read overlapping a write must get one of the two possible values; if it gets the old value, we can consider the read to have preceded the write, otherwise to have followed it.  It was only later, with the work eventually described in [69], that I realized the fallacy in this reasoning. 
     

  12. On Self-stabilizing Systems
    Massachusetts Computer Associates Technical Report CA 7412-0511 (5 December 1974).
    No electronic version available.
    This note was written upon reading Dijkstra's classic paper "Self-stabilizing Systems in Spite of Distributed Control" that appeared in the November 1974 issue of CACM (see
    [57]).  It generalizes one of the algorithms in Dijkstra's paper from a line of processes to an arbitrary tree of processes.  It also discusses the self-stabilizing properties of the bakery algorithm.  I never tried to publish this note--probably because I regarded it as too small a piece of work to be worth a paper by itself. 

    The note contains the intriguing sentence: "There is a complicated modified version of the bakery algorithm in which the values of all variables are bounded." I never wrote down that version, and I'm not sure what I had in mind.  But I think I was thinking of roughly the following modification.  As a process waits to enter its critical section, it keeps reducing its number, not entering the critical section until its number equals one.  A process p can reduce its number by at most one, and only when the next lower-numbered process's number is at least two less than p's number, and the next higher-numbered process is within one of p's number.  I think I intended to use the techniques of [24] to allow reading and writing of numbers to remain non-atomic while maintaining the order of waiting processes.  (If eventually all processes stop changing their numbers, then all processes will eventually read the correct numbers, allowing some process to progress.)  At one time, I convinced myself that this algorithm is correct.  But I never wrote a rigorous proof, so I don't know if it really works.  Filling in the details and proving correctness should be a nice exercise. 
     

  13. On Programming Parallel Computers
    Proceedings of a Conference on Programming Languages and Compilers for Parallel and Vector Machines, published as ACM SIGPLAN Notices 10, 3 (March 1975), 25-33.
    No electronic version available.
    This is a position paper advocating the use of a higher-level language that expresses what must be computed rather than how it is to be computed.  It argues that compilers are better than humans at generating efficient code for parallel machines.  I was so naive then! I soon learned how bad compilers really were, and how trying to make them smarter just made them buggier.  But compiler writers have gotten a lot better, so maybe this paper isn't as stupid now as it was then. 
     

  14. Parallel Execution on Array and Vector Computers
    Proceedings of the 1975 Sagamore Conference on Parallel Processing, T. Feng, ed., 187-191.
    No electronic version available.
    This paper considers the problem of efficiently executing a sequence of explicitly parallel statements--ones requiring simultaneous execution for all values of a parameter--when there are more parameter values than there are processors.  It is one of the lesser papers that I saved for the Sagamore Conference.  (See the discussion of
    [10].) 
     

  15. Multiple Byte Processing with Full-Word Instructions
    Communications of the ACM 18, 8   (August 1975), 471-475.
    PDF
    Copyright © 1975 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    My algorithms for parallelizing loops, described in papers starting with [8], were rather inefficient.  They could be sped up with parallel execution on an array processor like the Illiac-IV.  But I realized one could do even better than the 64-times speedup provided by the Illiac's 64 processors.  Each datum being manipulated was just a few bits, so I had the idea of packing several of the data into a single word and manipulating them simultaneously.  Not only could this speed computation on the Illiac, but it allowed one to do array processing on an ordinary uniprocessor.  This paper describes general techniques for doing such parallel computation on packed data.  It's a neat hack, and it's more useful now than it was then for two reasons.  The obvious reason is that word size is larger now, with many computers having 64-bit words.  The less obvious reason is that conditional operations are implemented with masking rather than branching.  Instead of branching around the operation when the condition is not met, masks are constructed so the operation is performed only on those data items for which the condition is true.  Branching is more costly on modern multi-issue computers than it was on the computers of the 70s. 
     

  16. The Synchronization of Independent Processes
    Acta Informatica 7, 1 (1976), 15-34.
    No electronic version available.
    There are a class of synchronization problems that are direct generalizations of mutual exclusion in that they assert constraints on when a process is allowed to perform a task.  They include the dining philosophers problem and the readers/writers problem.  This paper shows how a modified version of the bakery algorithm can be used to solve any such problem. 

    A referee said of the initial submission that the proofs were too long, tediously proving the obvious.  In fact, there was a bug in the initial version, and at least one of those obvious proofs was of a false statement.  Needless to say, I corrected the algorithm and wrote more careful proofs. 
     

  17. Comments on `A Synchronization Anomaly'
    Information Processing Letters 4, 4 (January 1976), 88-89.
    No electronic version available.
    This is a comment on a short note by Richard Lipton and Robert Tuttle claiming to find an inadequacy in Dijkstra's P and V synchronization primitives.  It points out that they had introduced a red herring because the problem that those primitives couldn't solve could not be stated in terms of entities observable within the system.  As my note states: "A system cannot be correct unless its correctness depends only upon events and conditions observable within the system." That's something worth remembering, since I've encountered that same sort of red herring on other occasions.  My observation is relevant to
    [62], but I had forgotten all about this note by the time I wrote [62]
     

  18. Garbage Collection with Multiple Processes: an Exercise in Parallelism
    Proceedings of the 1976 International Conference on Parallel Processing, T. Feng, ed., 50-54.
    No electronic version available.
    This is a minor extension to the concurrent garbage collection algorithm of
    [30].  That algorithm uses a single garbage collector process running in parallel with the "mutator" process that creates the garbage.  This paper describes how to use multiple processes to do the collection, and how to handle multiple mutator processes.  It is a minor work that I wrote up as an excuse for going to the Sagamore conference.  (See the discussion of [10].)  However, the conference had been renamed and moved from Sagamore to a less attractive, mosquito-infected site in the Michigan woods.  It was the last time I attended. 
     

  19. The Coordinate Method for the Parallel Execution of Iterative Loops
    Unpublished (August 1976).
    No electronic version available.
    This is a totally revised version of
    [8], complete with proofs (which had been omitted from [8]).  I submitted it to CACM, but the editor of CACM decided that it was more appropriate for JACM.  When I submitted it there, the editor of JACM rejected it without sending it out for review because the basic ideas had already appeared in the conference version [8].  I was young and inexperienced, so I just accepted the editor's decision.  I began to suspect that something was amiss a year or two later, when a paper from the same conference was republished verbatim in CACM.  By the time I realized how crazy the editor's decision had been, it didn't seem worth the effort of resubmitting the paper. 
     

  20. Towards a Theory of Correctness for Multi-User Data Base Systems
    Rejected by the 1977 IFIP Congress (October 1976).
    No electronic version available.
    This paper was a rough draft of some ideas, not all of which were correct.  I don't remember how it became known, but I received requests for copies for years afterwards. 
     

  21. On the Glitch Phenomenon  (with Richard Palais)
    Rejected by IEEE Transactions on Computers (November 1976).
    Postscript   -   Compressed Postscript   -   PDF
    When I wrote [11], a colleague at Massachusetts Computer Associates pointed out that the concurrent reading and writing of a single register, assumed in the bakery algorithm, requires an arbiter--a device for making a binary decision based on inputs that may be changing.  In the early 70s, computer designers rediscovered that it's impossible to build an arbiter that is guaranteed to reach a decision in a bounded length of time.  (This had been realized in the 50s but had been forgotten.)  My colleague's observation led to my interest in the arbiter problem--or "glitch" problem, as it was sometimes called. 

    The basic proof that an arbiter cannot have a bounded response time uses continuity to demonstrate that, if there are two inputs that can drive a flip-flop into two different states, then there must exist an input that makes the flip-flop hang.  At the time, it was very difficult to convince someone that this argument was valid.  They seemed to believe that, because a flip-flop has only discrete stable states, continuity doesn't apply. 

    I described the arbiter problem to Palais, who had been my de jure thesis adviser and afterwards became a colleague and a friend.  He recognized that the correct mathematical way to view what was going on is in terms of the compact-open topology on the space of flip-flop behaviors.  So, we wrote this paper to explain why the apparently discontinuous behavior of an arbiter is actually continuous in the appropriate topology. 

    This paper was rejected by the IEEE Transactions on Computers because the engineers who reviewed it couldn't understand the mathematics.  Six years later, the journal apparently acquired more mathematically sophisticated reviewers, and it published a less general result with a more complicated proof.  I believe someone has finally published a paper on the subject that does supersede ours. 
     

  22. Proving the Correctness of Multiprocess Programs
    IEEE Transactions on Software Engineering SE-3, 2 (March 1977), 125-143.
    No electronic version available.
    When I first learned about the mutual exclusion problem, it seemed easy and the published algorithms seemed needlessly complicated.  So, I dashed off a simple algorithm and submitted it to CACM.  I soon received a referee's report pointing out the error.  This had two effects.  First, it made me mad enough at myself to sit down and come up with a real solution.  The result was the bakery algorithm described in
    [11].  The second effect was to arouse my interest in verifying concurrent algorithms.  This has been a very practical interest.  I want to verify the algorithms that I write.  A method that I don't think is practical for my everyday use doesn't interest me. 

    In the course of my work on parallelizing sequential code (see [9]), I essentially rediscovered Floyd's method as a way of extracting properties of a program.  When I showed a colleague what I was doing, he went to our library at Massachusetts Computer Associates and gave me a copy of the original tech report version of Floyd's classic paper Assigning Meanings to Programs.  I don't remember when I read Hoare's An Axiomatic Basis for Computer Programming, but it was probably not long afterwards. 

    In the mid-70s, several people were thinking about the problem of verifying concurrent programs.  The seminal paper was Ed Ashcroft's Proving Assertions About Parallel Programs, published in the Journal of Computer and System Sciences in 1975.  That paper introduced the fundamental idea of invariance.  I discovered how to use the idea of invariance to generalize Floyd's method to multiprocess programs.  As is so often the case, in retrospect the idea seems completely obvious.  However, it took me a while to come to it.  I remember that, at one point, I thought that a proof would require induction on the number of processes. 

    This paper introduced the concepts of safety and liveness as the proper generalizations of partial correctness and termination to concurrent programs.  It also introduced the terms "safety" and "liveness" for those classes of properties.  I stole those terms from Petri nets, where they have similar but formally very different meanings.  (Safety of a Petri net is a particular safety property; liveness of a Petri net is not a liveness property.) 

    At the same time I was devising my method, Susan Owicki was writing her thesis at Cornell under David Gries and coming up with very much the same ideas.  The generalization of Floyd's method for proving safety properties of concurrent programs became known as the Owicki-Gries method.  Owicki and Gries did not do anything comparable to the method for proving liveness in my paper.  (Nissim Francez and Amir Pnueli developed a general proof method that did handle liveness properties, but it lacked a nice way of proving invariance properties.)  My method had deficiencies that were corrected with the introduction of temporal logic, discussed in [46]

    The Owicki-Gries version of the method for proving safety properties differed from mine in two ways.  The significant way was that I made the control state explicit, while they had no way to talk about it directly.  Instead, they introduced dummy variables to capture the control state.  The insignificant way was that I used a flowchart language while they used an Algol-like language. 

    The insignificant syntactic difference in the methods turned out to have important ramifications.  For writing simple concurrent algorithms, flowcharts are actually better than conventional toy programming languages because they make the atomic actions, and hence the control state, explicit.  However, by the mid-70s, flowcharts were passé and structured programming was all the rage, so my paper was forgotten and people read only theirs.  The paper was rediscovered about ten years later, and is now generally cited alongside theirs in the mandatory references to previous work. 

    More important though is that, because they had used a "structured" language, Owicki and Gries thought that they had generalized Hoare's method.  To the extent that Floyd's and Hoare's methods are different, it is because Hoare's method is based on the idea of hierarchical decomposition of proofs.  The Owicki-Gries method doesn't permit this kind of clean hierarchical decomposition.  Gries, commenting in 1999, said: "We hardly ever looked at Floyd's work and simply did everything based on Hoare's axiomatic theory." I suspect that, because they weren't thinking at all about Floyd's approach, they didn't notice the difference between the two, and thus they didn't realize that they were generalizing Floyd's method and not Hoare's. 

    The result of presenting a generalization of Floyd's method in Hoare's clothing was to confuse everyone.  For a period of about ten years, hardly anyone really understood that the Owicki-Gries method was just a particular way of structuring the proof of a global invariant.  I can think of no better illustration of this confusion than the EWD A Personal Summary of the Owicki-Gries Theory that Dijkstra wrote and subsequently published in a book of his favorite EWDs.( Throughout his career, Edsger Dijkstra wrote a series of notes identified by an "EWD" number.) If even someone as smart and generally clear-thinking as Dijkstra could write such a confusing hodge-podge of an explanation, imagine how befuddled others must have been.  A true generalization of Hoare's method to concurrent programs didn't come until several years later in [39]

    I think it soon became evident that one wanted to talk explicitly about the control state.  Susan Owicki obviously agreed, since we introduced the at, in, and after predicates for doing just that in [46].  Quite a bit later, I had more to say about dummy variables versus control state in [77]

    Dummy variables were more than just an ugly hack to avoid control variables.  They also allowed you to capture history.  Adding history variables makes it possible to introduce behavioral reasoning into an assertional proof.  (In the limit, you can add a variable that captures the entire history and clothe a completely behavioral proof in an assertional framework.)  What a program does next depends on its current state, not on its history.  Therefore, a proof that is based on a history variable doesn't capture the real reason why a program works.  I've always found that proofs that don't use history variables teach you more about the algorithm.  (As shown in [91], history variables may be necessary if the correctness conditions themselves are in terms of history.) 

    When we developed our methods, Owicki and I and most everyone else thought that the Owicki-Gries method was a great improvement over Ashcroft's method because it used the program text to decompose the proof.  I've since come to realize that this was a mistake.  It's better to write a global invariant.  Writing the invariant as an annotation allows you to hide some of the explicit dependence of the invariant on the control state.  However, even if you're writing your algorithm as a program, more often than not, writing the invariant as an annotation rather than a single global invariant makes things more complicated.  But even worse, an annotation gets you thinking in terms of separate assertions rather than in terms of a single global invariant.  And it's the global invariant that's important.  Ashcroft got it right.  Owicki and Gries and I just messed things up.  It took me quite a while to figure this out. 

    Sometime during the '80s, Jay Misra noticed that the definition of well-foundedness (Definition 8 on page 136) is obviously incorrect. 
     

  23. Formal Correctness Proofs for Multiprocess Algorithms
    Programmation-2me Colloque International, B. Robinet, ed., Dunod, Paris (1977), 1-8.
    No electronic version available.
    This is an abbreviated, conference version of
    [22].  (In those days, publication was fast enough that the journal version could appear before the conference version.)  I wrote this paper as an excuse for attending a conference in Paris.  However, it turned out that I went to Europe for other reasons that made it impossible for me to attend the conference.  I tried to withdraw the paper, but it was too late. 
     

  24. On Concurrent Reading and Writing
    Communications of the ACM 20, 11   (November 1977), 806-811.
    PDF
    Copyright © 1977 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    This paper came out of my study of the bakery algorithm of [11].  The problem with that algorithm is that it requires unbounded state.  To allow the state to be bounded in practice, I needed an algorithm for reading and writing multidigit numbers, one digit at a time, so that a read does not obtain too large a value if it overlaps a write.  This paper shows that this can be done by simply writing the digits in one direction and reading them in the other.  It also has some other nice algorithms.  The paper assumes that reading and writing a single digit are atomic operations.  The original version introduced the notion of a regular register and proved the results under the weaker assumption that the individual digits were regular.  However, the editor found the idea of nonatomic reads and writes to individual digits too heretical for CACM readers, and he insisted that I make the stronger assumption of atomicity.  So, the world had to wait another decade, until the publication of [69], to learn about regular registers. 
     

  25. State the Problem Before Describing the Solution
    ACM SIGSOFT Software Engineering Notes 3, 1 (January 1978) 26.
    PDF
    Copyright © 1978 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    The title says it all.  This one-page note is as relevant today as when I wrote it.  Replace "describing the solution" by "writing the program" and it becomes a practical recipe for improving software. 
     

  26. Time, Clocks and the Ordering of Events in a Distributed System
    Communications of the ACM 21, 7   (July 1978), 558-565.  Reprinted in several collections, including Distributed Computing: Concepts and Implementations, McEntire et al., ed.  IEEE Press, 1984.
    PDF
    Copyright © 1978 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    Jim Gray once told me that he had heard two different opinions of this paper: that it's trivial and that it's brilliant.  I can't argue with the former, and I am disinclined to argue with the latter. 

    The origin of this paper was a note titled The Maintenance of Duplicate Databases by Paul Johnson and Bob Thomas.  I believe their note introduced the idea of using message timestamps in a distributed algorithm.  I happen to have a solid, visceral understanding of special relativity (see [4]).  This enabled me to grasp immediately the essence of what they were trying to do.  Special relativity teaches us that there is no invariant total ordering of events in space-time; different observers can disagree about which of two events happened first.  There is only a partial order in which an event e1 precedes an event e2 iff e1 can causally affect e2.  I realized that the essence of Johnson and Thomas's algorithm was the use of timestamps to provide a total ordering of events that was consistent with the causal order.  This realization may have been brilliant.  Having realized it, everything else was trivial.  Because Thomas and Johnson didn't understand exactly what they were doing, they didn't get the algorithm quite right; their algorithm permitted anomalous behavior that essentially violated causality.  I quickly wrote a short note pointing this out and correcting the algorithm. 

    It didn't take me long to realize that an algorithm for totally ordering events could be used to implement any distributed system.  A distributed system can be described as a particular sequential state machine that is implemented with a network of processors.  The ability to totally order the input requests leads immediately to an algorithm to implement an arbitrary state machine by a network of processors, and hence to implement any distributed system.  So, I wrote this paper, which is about how to implement an arbitrary distributed state machine.  As an illustration, I used the simplest example of a distributed system I could think of--a distributed mutual exclusion algorithm. 

    This is my most often cited paper.  Many computer scientists claim to have read it.  But I have rarely encountered anyone who was aware that the paper said anything about state machines.  People seem to think that it is about either the causality relation on events in a distributed system, or the distributed mutual exclusion problem.  People have insisted that there is nothing about state machines in the paper.  I've even had to go back and reread it to convince myself that I really did remember what I had written. 

    The paper describes the synchronization of logical clocks.  As something of an afterthought, I decided to see what kind of synchronization it provided for real-time clocks.  So, I included a theorem about real-time synchronization.  I was rather surprised by how difficult the proof turned out to be.  This was an indication of what lay ahead in [61]

    This paper won the 2000 PODC Influential Paper Award (later renamed the Edsger W. Dijkstra Prize in Distributed Computing).  It won an ACM SIGOPS Hall of Fame Award in 2007. 
     

  27. The Specification and Proof of Correctness of Interactive Programs
    Proceedings of the International Conference on Mathematical Studies of Information Processing Kyoto, Japan (August, 1978), 477-540.
    No electronic version available.
    In the late 70s, people were talking about designing programming languages that would make program verification easier.  I didn't think much of that idea.  I felt that the difficulty in verification comes from the algorithm, not the details of the programming language in which it is described.  To demonstrate this view, I published in this paper a proof of correctness of a TECO program. 

    TECO stands for Text Editing and Correction.  It was the command language for the TECO editor, and it was the underlying macro language on which the original version of Emacs was built.  It was an obscure, low-level language whose goal was to perform powerful text editing operations with the minimum number of keystrokes.  A programming language designed to make verification easy would be completely unlike TECO.  The paper shows that you verify a TECO program the same way you verify a program written in a more conventional language. 

    The proof is perhaps also of some historical interest because it was an early example of a proof of an interactive program--that is, one that interacts with the user instead of just producing an answer.  Thus, correctness had to be asserted in terms of the sequence of input actions.  The paper generalizes the Floyd/Hoare method to deal with the history of environment actions. 
     

  28. The Implementation of Reliable Distributed Multiprocess Systems
    Computer Networks 2 (1978), 95-114.
    PDF
    All copyrights reserved by Elsevier Science 1978.
    In [26], I introduced the idea of implementing any distributed system by using an algorithm to implement an arbitrary state machine in a distributed system.  However, the algorithm in [26] assumed that processors never fail and all messages are delivered.  This paper gives a fault-tolerant algorithm.  It's a real-time algorithm, assuming upper bounds on message delays in the absence of faults, and that nonfaulty processes had clocks synchronized to within a known bound. 

    To my knowledge, this is the first published paper to discuss arbitrary failures (later called Byzantine failures).  It actually considered malicious behavior, not using such behavior simply as a metaphor for completely unpredictable failures.  Its algorithm was the inspiration for the digital signature algorithm of [40].  With its use of real-time, this paper presaged the ideas in [54]
     

  29. SIFT: Design and Analysis of a Fault-Tolerant Computer for Aircraft Control  (with John Wensley et al.)
    Proceedings of the IEEE 66, 10 (October 1978), 1240-1255.
    PDF
    When it became clear that computers were going to be flying commercial aircraft, NASA began funding research to figure out how to make them reliable enough for the task.  Part of that effort was the SIFT project at SRI.  This project was perhaps most notable for producing the Byzantine generals problem and its solutions, first reported in [40]

    This paper gives an overview of the complete SIFT project, which included designing the hardware and software and formally verifying the system's correctness.  It announces the results that appear in [40].  It also is a very early example of the basic specification and verification method I still advocate: writing a specification as a state-transition system and showing that each step of the lower-level specification either implements a step of the higher-level one or is a "stuttering" step that leaves the higher-level state unchanged.  The paper doesn't mention the use of an invariant, but that was probably omitted to save space. 

    This paper was a group effort that I choreographed in a final frenzy of activity to get the paper out in time for the issue's deadline.  I don't remember who wrote what, but the section on verification seems to be my writing. 
     

  30. On-the-fly Garbage Collection: an Exercise in Cooperation  (with Edsger Dijkstra et al.)
    Communications of the ACM 21, 11   (November 1978), 966-975.
    PDF
    Copyright © 1978 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    This paper presents the first concurrent garbage collection algorithm--that is, an algorithm in which the collector operates concurrently with the process that creates the garbage.  The paper is fairly well known; its history is not. 

    I received an early version of the paper from Dijkstra, and I made a couple of suggestions.  Dijkstra incorporated them and, quite generously, added me to the list of authors.  He submitted the paper to CACM.  The next I heard about it was when I received a copy of a letter from Dijkstra to the editor withdrawing the paper.  The letter said that someone had found an error in the algorithm, but gave no indication of what the error was.  Since Dijkstra's proof was so convincing, I figured that it must be a trivial error that could easily be corrected. 

    I had fairly recently written [22].  So, I decided to write a proof using that proof method, thinking that I would then find and correct the error.  In about 15 minutes, trying to write the proof led me to the error.  To my surprise, it was a serious error. 

    I had a hunch that the algorithm could be fixed by changing the order in which two operations were performed.  But I had no good reason to believe that would work.  Indeed, I could see no simple informal argument to show that it worked.  However, I decided to go ahead and try to write a formal correctness proof anyway.  It took me about two days of solid work, but I constructed the proof.  When I was through, I was convinced that the algorithm was now correct, but I had no intuitive understanding of why it worked. 

    In the meantime, Dijkstra figured out that the algorithm could be fixed by interchanging two other operations, and he wrote the same kind of behavioral proof as before.  His fix produced an arguably more efficient algorithm than mine, so that's the version we used.  I sketched an assertional proof of that algorithm.  Given the evidence of the unreliability of his style of proof, I tried to get Dijkstra to agree to a rigorous assertional proof.  He was unwilling to do that, though he did agree to make his proof somewhat more assertional and less operational.  Here are his comments on that, written in July, 2000:

    There were, of course, two issues at hand: (A) a witness showing that the problem of on-the-fly garbage collection with fine-grained interleaving could be solved, and (B) how to reason effectively about such artifacts.  I am also certain that at the time all of us were aware of the distinction between the two issues.  I remember very well my excitement when we convinced ourselves that it could be done at all; emotionally it was very similar to my first solutions to the problem of self-stabilization.  Those I published without proofs! It was probably a period in my life that issue (A) in general was still very much in the foreground of my mind: showing solutions to problems whose solvability was not obvious at all.  It was more or less my style.  I had done it (CACM, Sep. 1965) with the mutual exclusion. 

    I, too, have always been most interested in showing that something could be done, rather than in finding a better algorithm for doing what was known to be possible.  Perhaps that is why I have always been so impressed by the brilliance of Dijkstra's work on concurrent algorithms. 

    David Gries later published an Owicki-Gries style proof of the algorithm that was essentially the same as the one I had sketched.  He simplified things a bit by combining two atomic operations into one.  He mentioned that in a footnote, but CACM failed to print the footnote.  (However, they did print the footnote number in the text.) 

    The lesson I learned from this is that behavioral proofs are unreliable and one should always use state-based reasoning for concurrent algorithms--that is, reasoning based on invariance. 
     

  31. A General Construction for Expressing Repetition
    ACM SIGPLAN Notices 14, 3 (March 1979) 38-42.
    PDF
    Copyright © 1979 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    Fortunately, this note has been completely forgotten.  It was written when people were still proposing new programming-language constructs.  The one presented here may have little to recommend it, but it was no worse than many others. 
     

  32. A New Approach to Proving the Correctness of Multiprocess Programs
    ACM Transactions on Programming Languages and Systems 1, 1 (July 1979), 84-97.
    PDF
    Copyright © 1979 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    Like everyone else at the time, when I began studying concurrent algorithms, I reasoned about them behaviorally.  Such reasoning typically involved arguments based on the order in which events occur.  I discovered that proofs can be made simpler, more elegant, and more mathematical by reasoning about operations (which can be composed of multiple events) and two relations on them: precedes (denoted by a solid arrow) and can affect (denoted by a dashed arrow).  Operation A precedes operation B if all the events of A precede all the events of B; and A can affect B if some event in A precedes some event in B.  These relations obey some simple rules that can reduce behavioral reasoning to mathematical symbol pushing. 

    This paper introduced the method of reasoning with the two arrow relations and applied it to a variant of the bakery algorithm.  In the spring of 1976 I spent a month working with Carel Scholten at the Philips Nat Lab in Eindhoven.  Scholten was a colleague and friend of Dijkstra, and the three of us spent one afternoon a week working, talking, and drinking beer at Dijkstra's house.  The algorithm emerged from one of those afternoons.  I think I was its primary author, but as I mention in the paper, the beer and the passage of time made it impossible for me to be sure of who was responsible for what. 

    The solid and dashed arrow formalism provides very elegant proofs for tiny algorithms such as the bakery algorithm.  But, like all behavioral reasoning, it is hard to make completely formal, and it collapses under the weight of a complex problem.  You should use assertional methods to reason about complex algorithms.  However, standard assertional reasoning requires that the algorithm be written in terms of its atomic operations.  The only assertional approach to reasoning directly about nonatomic operations (without translating them into sequences of atomic operations) is the one in [85], which is not easy to use.  The two-arrow formalism is still good for a small class of problems. 

    The formalism seems to have been almost completely ignored, even among the theoretical concurrency community.  I find this ironic.  There is a field of research known by the pretentious name of "true concurrency".  Its devotees eschew assertional methods that are based on interleaving models because such models are not truly concurrent.  Instead, they favor formalisms based on modeling a system as a partial ordering of events, which they feel is the only truly concurrent kind of model.  Yet, those formalisms assume that events are atomic and indivisible.  Atomic events don't overlap in time the way real concurrent operations do.  The two-arrow formalism is the only one I know that is based on nonatomic operations and could therefore be considered truly concurrent.  But, as far as I know, the true concurrency community has paid no attention to it. 
     

  33. How to Present a Paper
    Unpublished note, 4 August 1979.
    Text File
    This three-page note is about presenting a paper at a conference, but it offers good advice for any talk.  Except for a couple of suggestions about hand-written slides, it still applies today. 
     

  34. [IEE1979]How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs
    IEEE Transactions on Computers C-28, 9 (September 1979) 690-691.
    PDF
    I forget what prompted me to be thinking about memory caching, but it occurred to me one day that multiprocessor synchronization algorithms assume that each processor accesses the same word in memory, but each processor actually accesses its own copy in its cache.  It hardly required a triple-digit IQ to realize that this could cause problems.  I suppose what made this paper worth reading was its simple, precise definition of sequential consistency as the required correctness condition.  This was not the first paper about cache coherence.  However, it is the early paper most often cited in the cache-coherence literature. 
     

  35. Constructing Digital Signatures from a One Way Function
    SRI International Technical Report CSL-98 (October 1979).
    No electronic version available.
    At a coffee house in Berkeley around 1975, Whitfield Diffie described a problem to me that he had been trying to solve: constructing a digital signature for a document.  I immediately proposed a solution.  Though not very practical--it required perhaps 64 bits of published key to sign a single bit--it was the first digital signature algorithm.  Diffie and Hellman mention it in their classic paper:

    Whitfield Diffie and Martin E. Hellman.   New Directions in Cryptography.   IEEE Transactions on Information Theory IT-22, 6 (1976), 644-654. 

    (I think it's at the bottom right of page 650.) 

    In 1978, Michael Rabin published a paper titled Digitalized Signatures containing a more practical scheme for generating digital signatures of documents.  (I don't remember what other digital signature algorithms had already been proposed.)  However, his solution had some drawbacks that limited its utility.  This report describes an improvement to Rabin's algorithm that eliminates those drawbacks. 

    I'm not sure why I never published this report.  However, I think it was because, after writing it, I realized that the algorithm could be fairly easily derived directly from Rabin's algorithm.  So, I didn't feel that it added much to what Rabin had done.  However, I've been told that this paper is cited in the cryptography literature and is considered significant, so perhaps I was wrong. 
     

  36. On the Proof of Correctness of a Calendar Program
    Communications of the ACM 22, 10   (October 1979), 554-556.
    PDF
    Copyright © 1979 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    In the May, 1978 CACM, Matthew Geller published a paper titled Test Data as an Aid in Proving Program Correctness.  He argued that there were some programs whose correctness is so hard to state formally that formally verifying them is useless because the specification is likely to be wrong.  He gave as an example a program to compute the number of days between two dates in the same year, claiming that it would be hard to check the correctness of a precise statement of what it meant for the program to be correct.  This paper proved him wrong.  (It also makes the amusing observation that Geller's solution is wrong because it fails for dates before the advent of the Gregorian calendar.)  As a bonus, readers of this paper were alerted well in advance that the year 2000 is a leap year. 
     

  37. Letter to the Editor
    Communications of the ACM 22, 11   (November 1979), 624.
    PDF
    Copyright © 1979 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    In the May, 1979 CACM, De Millo, Lipton, and Perlis published an influential paper titled Social Process and Proofs of Theorems and Programs.  This paper made some excellent observations.  However, by throwing in a few red herrings, they came to some wrong conclusions about program verification.  More insidiously, they framed the debate as one between a reasonable engineering approach that completely ignores verification and a completely unrealistic view of verification advocated only by its most naive proponents.  (There were, unfortunately, quite a few such proponents.)  At that time, some ACM publications had a special section on algorithms.  In an ironic coincidence, the same issue of CACM carried the official ACM policy on algorithm submissions.  It included all sorts of requirements on the form of the code, and even on the comments.  Missing was any requirement that the correctness of the algorithm be demonstrated in any way.  I was appalled at the idea that, ten years after Floyd and Hoare's work on verification, the ACM was willing to publish algorithms with no correctness argument.  The purpose of my letter was to express my dismay.  I ironically suggested that they had succumbed to the arguments of De Millo, Lipton, and Perlis in their policy.  As a result, my letter was published as a rebuttal to the De Millo, Lipton, and Perlis paper.  No-one seems to have taken it for what it was--a plea to alter the ACM algorithms policy to require that there be some argument to indicate that an algorithm worked. 
     

  38. `Sometime' is Sometimes `Not Never'
    Proceedings of the Seventh ACM Symposium on Principles of Programming Languages, ACM SIGACT-SIGPLAN (January 1980).
    PDF
    Copyright © 1980 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    After graduating from Cornell, Susan Owicki joined the faculty of Stanford.  Some time around 1978, she organized a seminar to study the temporal logic that Amir Pnueli had recently introduced to computer science.  I was sure that temporal logic was some kind of abstract nonsense that would never have any practical application, but it seemed like fun, so I attended.  I observed that people got very confused because, in Pnueli's logic, the concepts of always and eventually mean what they do to ordinary people.  In particular, something is not always true if and only if it is eventually false.  (It doesn't always rain means that it eventually stops raining.)  However, most computer scientists have a different way of thinking.  For them, something is not always true if and only if it might possibly become false.  (The program doesn't always produce the right answer means that it might produce the wrong answer.) 

    I realized that there are two types of temporal logic: the one Pnueli used I called linear time logic; the one most computer scientists seemed to find natural I called branching time.  (These terms were used by temporal logicians, but they distinguished the two logics by the axioms they satisfied, while I described them in terms of different kinds of semantics.)  Pnueli chose the right kind of logic--that is, the one that is most useful for expressing properties of concurrent systems.  I wrote this paper to explain the two kinds of logic, and to advocate the use of linear-time logic.  However, I figured that the paper wouldn't be publishable without some real theorems.  So, I proved some simple results demonstrating that the two kinds of logic really are different. 

    I submitted the paper to the Evian Conference, a conference on concurrency held in France to which it seems that everyone working in the field went.  I was told that my paper was rejected because they accepted a paper by Pnueli on temporal logic, and they didn't feel that such an obscure subject merited two papers.  I then submitted the paper to FOCS, where it was also rejected.  I have very rarely resubmitted a paper that has been rejected.  Fortunately, I felt that this paper should be published.  It has become one of the most frequently cited papers in the temporal-logic literature. 
     

  39. The `Hoare Logic' of Concurrent Programs
    Acta Informatica 14, 1 (1980), 21-37.
    No electronic version available.
    As explained in the discussion of
    [22], the Owicki-Gries method and its variants are generalizations of Floyd's method for reasoning about sequential programs.  Hoare's method formalizes Floyd's with a set of axioms for deriving triples of the form {P}S{Q}, which means that if statement S is executed from a state in which P is true and terminates, then Q will be true.  This paper introduces the generalization of Hoare's method to concurrent programs, replacing Hoare triples with assertions of the form {I}S, which means that the individual actions of statement S leave the predicate I invariant. 

    When I wrote this paper, I sent a copy to Tony Hoare thinking that he would like it.  He answered with a letter that said, approximately: "I always thought that the generalization to concurrent programs would have to look something like that; that's why I never did it." (Unfortunately, I no longer have the letter.)  At the time, I dismissed his remark as the ramblings of an old fogey.  I now think he was right--though probably for different reasons than he does.  As I indicated in the discussion of [22], I think Ashcroft was right; one should simply reason about a single global invariant, and not do this kind of decomposition based on program structure. 
     

  40. Reaching Agreement in the Presence of Faults  (with Marshall Pease and Robert Shostak)
    Journal of the Association for Computing Machinery 27, 2 (April 1980).
    PDF
    Copyright © 1980 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    Before this paper, it was generally assumed that a three-processor system could tolerate one faulty processor.  This paper shows that "Byzantine" faults, in which a faulty processor sends inconsistent information to the other processors, can defeat any traditional three-processor algorithm.  (The term Byzantine didn't appear until [45].)  In general, 3n+1 processors are needed to tolerate n faults.  However, if digital signatures are used, 2n+1 processors are enough.  This paper introduced the problem of handling Byzantine faults.  I think it also contains the first precise statement of the consensus problem. 

    I am often unfairly credited with inventing the Byzantine agreement problem.  The problem was formulated by people working on SIFT (see [29]) before I arrived at SRI.  I had already discovered the problem of Byzantine faults and written [28].  (I don't know if that was earlier than or concurrent with its discovery at SRI.)  However, the people at SRI had a much simpler and more elegant statement of the problem than was present in [28]

    The 4-processor solution presented in this paper and the general impossibility result were obtained by Shostak; Pease invented the 3n+1-processor solution.  My contribution to the work in this paper was the solution using digital signatures, which is based on the algorithm in [28].  It was my work on digital signatures (see [35]) that led me to think in that direction.  However, digital signatures, as used here, are a metaphor.  Since the signatures need be secure only against random failure, not against an intelligent adversary, they are much easier to implement than true digital signatures.  However, this point seems to have escaped most people, so they rule out the algorithms that use digital signatures because true digital signature algorithms are expensive.  Thus, 3n+1-processor solutions are used even though there are satisfactory 2n+1-processor solutions,

    My other contribution to this paper was getting it written.  Writing is hard work, and without the threat of perishing, researchers outside academia generally do less publishing than their colleagues at universities.  I wrote an initial draft, which displeased Shostak so much that he completely rewrote it to produce the final version. 

    Over the years, I often wondered whether the people who actually build airplanes know about the problem of Byzantine failures.  In 1997, I received email from John Morgan who used to work at Boeing.  He told me that he came across our work in 1986 and that, as a result, the people who build the passenger planes at Boeing are aware of the problem and design their systems accordingly.  But, in the late 80s and early 90s, the people at Boeing working on military aircraft and on the space station, and the people at McDonnell-Douglas, did not understand the problem.  I have no idea what Airbus knows or when they knew it. 

    This paper was awarded the 2005 Edsger W. Dijkstra Prize in Distributed Computing. 
     

  41. Program Verification: An Approach to Reliable Hardware and Software  (with J S. Moore)
    Transactions of the American Nuclear Society 35 (November 1980), 252-253.
    No electronic version available.
    In 1980, J Moore and I were both at SRI and had been involved in the verification of SIFT (see
    [29]).  The nuclear power industry was, for obvious reasons, interested in the correctness of computer systems.  I forget how it came to pass, but Moore and I were invited to present a paper on verification at some meeting of power industry engineers.  This was the result. 
     

  42. Password Authentication with Insecure Communication
    Communications of the ACM 24, 11   (November 1981), 770-772.
    PDF
    Copyright © 1981 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    Despite a casual interest in civilian cryptography going back to its origins (see the discussion of [35]), this is my only publication in the field.  It presents a cute hack for using a single password to login to a system multiple times without allowing an adversary to gain access to the system by eavesdropping.  This hack is the basis of Bellcore's S/KEY system and of the PayWord system of Rivest and Shamir. 
     

  43. TIMESETS--A New Method for Temporal Reasoning About Programs
    Logics of Programs, Dexter Kozen editor, Springer-Verlag Lecture Notes in Computer Science Volume 131 (1982), 177-196.
    No electronic version available.
    Pnueli's introduction of temporal logic in 1977 led to an explosion of attempts to find new logics for specifying and reasoning about concurrent systems.  Everyone was looking for the silver-bullet logic that would solve the field's problems.  This paper is proof that I was not immune to this fever.  For reasons explained in the discussion of
    [49], it is best forgotten.  Some people may find this of historical interest because it is an early example of an interval logic. 
     

  44. Byzantine Generals and Transaction Commit Protocols  (with Michael Fischer)
    Unpublished (April 1982).
    Postscript   -   Compressed Postscript   -   PDF
    I visited Michael Fischer at Yale in the spring of 1982.  It was known that solutions to the Byzantine generals problem that can handle n Byzantine failures require n+1 rounds of communication.  While I was at Yale, Fischer and I proved that this number of rounds were needed even to handle more benign failures. 

    On the trip back home to California, I got on an airplane at Laguardia Airport in the morning with a snowstorm coming in.  I got off the airplane about eight hours later, still at Laguardia Airport, having written this paper.  I then sent it to Fischer for his comments.  I waited about a year and a half.  By the time he finally decided that he wasn't going to do any more work on the paper, subsequent work by others had been published that superseded it. 
     

  45. The Byzantine Generals Problem  (with Marshall Pease and Robert Shostak)
    ACM Transactions on Programming Languages and Systems 4, 3 (July 1982), 382-401.
    PDF
    Copyright © 1982 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    I have long felt that, because it was posed as a cute problem about philosophers seated around a table, Dijkstra's dining philosopher's problem received much more attention than it deserves.  (For example, it has probably received more attention in the theory community than the readers/writers problem, which illustrates the same principles and has much more practical importance.)  I believed that the problem introduced in [40] was very important and deserved the attention of computer scientists.  The popularity of the dining philosophers problem taught me that the best way to attract attention to a problem is to present it in terms of a story. 

    There is a problem in distributed computing that is sometimes called the Chinese Generals Problem, in which two generals have to come to a common agreement on whether to attack or retreat, but can communicate only by sending messengers who might never arrive.  I stole the idea of the generals and posed the problem in terms of a group of generals, some of whom may be traitors, who have to reach a common decision.  I wanted to assign the generals a nationality that would not offend any readers.  At the time, Albania was a completely closed society, and I felt it unlikely that there would be any Albanians around to object, so the original title of this paper was The Albanian Generals Problem.  Jack Goldberg was smart enough to realize that there were Albanians in the world outside Albania, and Albania might not always be a black hole, so he suggested that I find another name.  The obviously more appropriate Byzantine generals then occurred to me. 

    The main reason for writing this paper was to assign the new name to the problem.  But a new paper needed new results as well.  I came up with a simpler way to describe the general 3n+1-processor algorithm.  (Shostak's 4-processor algorithm was subtle but easy to understand; Pease's generalization was a remarkable tour de force.)  We also added a generalization to networks that were not completely connected.  (I don't remember whose work that was.)  I also added some discussion of practical implementation details. 
     

  46. Proving Liveness Properties of Concurrent Programs  (with Susan Owicki)
    ACM Transactions on Programming Languages and Systems 4, 3 (July 1982), 455-495.
    PDF
    Copyright © 1982 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    During the late 70s and early 80s, Susan Owicki and I worked together quite a bit.  We were even planning to write a book on concurrent program verification.  But this paper is the only thing we ever wrote together. 

    In [22], I had introduced a method of proving eventuality properties of concurrent programs by drawing a lattice of predicates, where arrows from a predicate P to predicates Q1, ..., Qn mean that, if the program reaches a state satisfying P, it must thereafter reach a state satisfying one of the Qi.  That method never seemed practical; formalizing an informal proof was too much work. 

    Pnueli's introduction of temporal logic allowed the predicates in the lattice to be replaced by arbitrary temporal formulas.  This turned lattice proofs into a useful way of proving liveness properties.  It permitted a straightforward formalization of a particularly style of writing the proofs.  I still use this proof style to prove leads-to properties, though the proofs are formalized with TLA (see [101]).  However, I no longer bother drawing pictures of the lattices.  This paper also introduced at, in, and after predicates for describing program control. 

    It's customary to list authors alphabetically, unless one contributed significantly more than the other, but at the time, I was unaware of this custom.  Here is Owicki's account of how the ordering of the authors was determined. 

    As I recall it, you raised the question of order, and I proposed alphabetical order.  You declined--I think you expected the paper to be important and didn't think it would be fair to get first authorship on the basis of a static property of our names.  On the night we finished the paper, we went out to dinner to celebrate, and you proposed that if the last digit of the bill was even (or maybe odd), my name would be first.  And, indeed, that's the way it came out. 


     

  47. An Assertional Correctness Proof of a Distributed Program
    Science of Computer Programming 2, 3 (December 1982), 175-206.
    No electronic version available.
    I showed in
    [26] that there is no invariant way of defining the global state of a distributed system.  Assertional methods, such as [22], reason about the global state.  So, I concluded that these methods were not appropriate for reasoning about distributed systems.  When I wrote this paper, I was at SRI and partly funded by a government contract for which we had promised to write a correctness proof of a distributed algorithm.  I tried to figure out how to write a formal proof without reasoning about the global state, but I couldn't.  The final report was due, so I decided that there was no alternative to writing an assertional proof.  I knew there would be no problem writing such a proof, but I expected that, with its reliance on an arbitrary global state, the proof would be ugly.  To my surprise, I discovered that the proof was quite elegant.  Philosophical considerations told me that I shouldn't reason about global states, but this experience indicated that such reasoning worked fine.  I have always placed more reliance on experience than philosophy, so I have written assertional proofs of distributed systems ever since.  (Others, more inclined to philosophy, have spent decades looking for special ways to reason about distributed systems.) 
     

  48. Reasoning About Nonatomic Operations
    Proceedings of the Tenth ACM Symposium on Principles of Programming Languages, ACM SIGACT-SIGPLAN (January 1983), 28-37.
    PDF
    Copyright © 1983 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    From the time I discovered the bakery algorithm (see [11]), I was fascinated by the problem of reasoning about a concurrent program without having to break it into indivisible atomic actions.  In [32], I described how to do this for behavioral reasoning.  But I realized that assertional reasoning, as described in [22], was the only proof method that could scale to more complex problems.  This paper was my first attempt at assertional reasoning about nonatomic operations.  It introduces the win (weakest invariant) operator that later appeared in [85], but using the notation of Pratt's dynamic logic rather than Dijkstra's predicate transformers. 

    I have in my files a letter from David Harel, who was then an editor of Information and Control, telling me that the paper was accepted by the journal, after revision to satisfy some concerns of the referees.  I don't remember why I didn't submit a revised version.  I don't think I found the referees' requests unreasonable.  It's unlikely that I abandoned the paper because I had already developed the method in [85], since that didn't appear as a SRC research report until four years later.  Perhaps I was just too busy. 
     

  49. Specifying Concurrent Program Modules
    ACM Transactions on Programming Languages and Systems 5, 2 (April 1983), 190-222.
    PDF
    Copyright © 1983 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    The early methods for reasoning about concurrent programs dealt with proving that a program satisfied certain properties--usually invariance properties.  But, proving particular properties showed only that the program satisfied those properties.  There remained the possibility that the program was incorrect because it failed to satisfy some other properties.  Around the early 80s, people working on assertional verification began looking for ways to write a complete specification of a system.  A specification should say precisely what it means for the system to be correct, so that if we prove that the system meets its specification, then we can say that the system really is correct.  (Process algebraists had already been working on that problem since the mid-70s, but there was--and I think still is--little communication between them and the assertional verification community.) 

    At SRI, we were working on writing temporal logic specifications.  One could describe properties using temporal logic, so it seemed very natural to specify a system by simply listing all the properties it must satisfy.  Richard Schwartz, Michael Melliar-Smith, and I collaborated on a paper titled Temporal Logic Specification of Distributed Systems, which was published in the Proceedings of the 2nd International Conference on Distributed Computing Systems, held in Paris in 1981.  However, I became disillusioned with temporal logic when I saw how Schwartz, Melliar-Smith, and Fritz Vogt were spending days trying to specify a simple FIFO queue--arguing over whether the properties they listed were sufficient.  I realized that, despite its aesthetic appeal, writing a specification as a conjunction of temporal properties just didn't work in practice.  So, I had my name removed from the paper before it was published, and I set about figuring out a practical way to write specifications.  I came up with the approach described in this paper, which I later called the transition axiom method.  Schwartz stopped working on specification and verification in the mid-80s.  He wrote recently (in June 2000):

    [T]he same frustration with the use of temporal logic led Michael, Fritz Vogt and me to come up with Interval Logic as a higher level model in which to express time-ordered properties of events.  [See [43].]  As you recall, interval logic expressed constraints forward and backward around significant events in order to more closely capture the way that people describe event-driven behavior.  Ultimately, I remain unsatisfied with any of our attempts, from the standpoint of reaching practical levels. 

    This paper is the first place I used the idea of describing a state transition as a boolean-valued function of primed and unprimed variables.  However, by the early 80s, the idea must have been sufficiently obvious that I didn't claim any novelty for it, and I forgot that I had even used it in this paper until years later (see the discussion of [101]). 
     

  50. Specification and Proof of a Fault-Tolerant Real-Time Algorithm
    In Highly Dependable Distributed Systems, final report for SRI Project 4180 (Contract Number DAEA18-81-G-0062) (June 1983).
    No electronic version available.
    In the spring of 1983, I was called upon to contribute a chapter for the final report on a project at SRI.  I chose to write a specification and correctness proof of a Byzantine general's algorithm--a distributed, real-time algorithm.  (Nonfaulty components must satisfy real-time constraints, and the correctness of the algorithm depends on these constraints.)  I began the exercise on a Wednesday morning.  By noon that Friday, I had the final typeset output.  I presume there are lots of errors; after finishing it, I never reread it carefully and I have no indication that anyone else did either.  But, I have no reason to doubt the basic correctness of the proof.  I never published this paper because it didn't seem worth publishing.  The only thing I find remarkable about it is that so many computer scientists are unaware that, even in 1983, writing a formal correctness proof of a distributed real-time algorithm was an unremarkable feat. 
     

  51. The Weak Byzantine Generals Problem
    Journal of the Association for Computing Machinery 30, 3 (July 1983), 668-676.
    PDF
    Copyright © 1983 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    This paper introduces a weaker version of the Byzantine generals problem described in [40].  The problem is "easier" because there exist approximate solutions with fewer than 3n processes that can tolerate n faults, something shown in [40] to be impossible for the original Byzantine generals problem.  I don't remember how I came to consider this problem. 
     

  52. PHIL: A Semantic Structural Graphical Editor  (with Joseph Goguen)
    SRI International Technical Report (August 1983).
    No electronic version available.
    SRI had a contract with Philips to design a graphical editor for structured documents (such as programs).  Goguen and I were the prime instigators and principle investigators of the project.  This is the project's final report.  Rindert Schutten of Philips visited SRI and implemented a very preliminary version.  I felt that our design was neither novel enough to constitute a major contribution nor modest enough to be the basis for a practical system at that time, and I thought the project had been dropped.  However, Goguen informed me much later that some version of the system was still being used in the early 90s, and that it had evolved into a tool for VLSI layout, apparently called MetaView. 
     

  53. What Good Is Temporal Logic?
    Information Processing 83, R. E. A. Mason, ed., Elsevier Publishers (1983), 657-668.
    PDF
    This was an invited paper.  It describes the state of my views on specification and verification at the time.  It is notable for introducing the idea of invariance under stuttering and explaining why it's a vital attribute of a specification logic.  It is also one of my better-written papers. 
     

  54. Using Time Instead of Timeout for Fault-Tolerant Distributed Systems
    ACM Transactions on Programming Languages and Systems 6, 2 (April 1984), 254-280.
    PDF
    Copyright © 1984 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    The genesis of this paper was my realization that, in a multiprocess system with synchronized clocks, the absence of a message can carry information.  I was fascinated by the idea that a process could communicating zillions of bits of information by not sending messages.  The practical implementation of Byzantine generals algorithms described in [45] could be viewed as an application of this idea.  I used the idea as something of a gimmick to justify the paper.  The basic message of this paper should have been pretty obvious: the state machine approach, introduced in [26], allows us to turn any consensus algorithm into a general method for implementing distributed systems; the Byzantine generals algorithms of [45] were fault-tolerant consensus algorithms; hence, we had fault-tolerant implementations of arbitrary distributed systems.  I published the paper because I had found few computer scientists who understood this. 
     

  55. The Hoare Logic Of CSP, and All That  (with Fred Schneider)
    ACM Transactions on Programming Languages and Systems 6, 2 (April 1984), 281-296.
    PDF
    Copyright © 1984 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    I felt that in [39], I had presented the right way to do assertional (also known as Owicki-Gries style) reasoning about concurrent programs.  However, many people were (and perhaps still are) hung up on the individual details of different programming languages and are unable to understand that the same general principles apply to all of them.  In particular, people felt that "distributed" languages based on rendezvous or message passing were fundamentally different from the shared-variable language that was considered in [39].  For example, some people made the silly claim that the absence of shared variables made it easier to write concurrent programs in CSP than in more conventional languages.  (My response is the equally silly assertion that it's harder to write concurrent programs in CSP because the control state is shared between processors.) 

    Schneider agreed with me that invariance was the central concept in reasoning about concurrent programs.  He was also an expert on all the different flavors of message passing that had been proposed.  We demonstrated in this paper that the basic approach of [39] worked just was well with CSP; and we claimed (without proof) that it also worked in other "distributed" languages.  I found it particularly funny that we should be the ones to give a Hoare logic to CSP, while Hoare was using essentially behavioral methods to reason about CSP programs.  I'm still waiting for the laughter. 
     

  56. Byzantine Clock Synchronization  (with Michael Melliar-Smith)
    Proceedings of the Third Annual ACM Symposium on Principles of Distributed Computing (August, 1984), 68-74.
    PDF
    Copyright © 1984 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    This is the preliminary conference version of [61]
     

  57. Solved Problems, Unsolved Problems and NonProblems in Concurrency
    Proceedings of the Third Annual ACM Symposium on Principles of Distributed Computing (August, 1984) 1-11.
    PDF
    Copyright © 1984 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.
    This is the invited address I gave at the 1983 PODC conference, which I transcribed from a tape recording of my presentation.  The first few minutes of the talk were not taped, so I had to reinvent the beginning.  This talk is notable because it marked the rediscovery by the computer science community of Dijkstra's 1974 CACM paper that introduced the concept of self-stabilization.  A self-stabilizing system is one that, when started in any state, eventually "rights itself" and operates correctly.  The importance of self-stabilization to fault tolerance was obvious to me and a handful of people, but went completely over the head of most readers.  Dijkstra's paper gave little indication of the practical significance of the problem, and few people understood its importance.  So, this gem of a paper had disappeared without a trace by 1983.  My talk brought Dijkstra's paper to the attention of the PODC community, and now self-stabilization is a regular subfield of distributed computing.  I regard the resurrection of Dijkstra's brilliant work on self-stabilization to be one of my greatest contributions to computer science. 

    The paper contains one figure--copied directly from a transparency--with an obviously bogus algorithm.  I tried to recreate an algorithm from memory and wrote complete nonsense.  It's easy to make such a mistake when drawing a transparency, and I probably didn't bother to look at it when I prepared the paper.  To my knowledge, it is the only incorrect algorithm I have published. 
     

  58. On a "Theorem" of Peterson
    Unpublished (October, 1984).
    Postscript   -   Compressed Postscript   -   PDF
    This three-page note gives an example that appears to contradict a theorem in a TOPLAS article by Gary Peterson.  Whether or not it does depends on the interpretation of the statement of the theorem, which is given only informally in English.  I draw the moral that greater rigor is needed.  When I sent this