This document contains descriptions of almost all my technical papers and electronic versions of many of them for downloading. Omitted are papers for which I no longer have copies and papers that are incomplete. I have also omitted early versions of some of these papers--even in cases where the title changed. Included are some initial drafts of papers that I abandoned before fixing errors or other problems in them. A table of contents precedes the descriptions. I also include a brief curriculum vitae. A printable version of this document is available as a postscript, gzipped postscript, or pdf file.
Each description attempts to explain the genesis of the work. However, I have forgotten how I came to write most of my papers. Even when I discourse at length about the development of the ideas, I am giving only a subjective view based on my unreliable memory. Whenever possible, I have asked other people involved to check the accuracy of what I've written. However, what I have most often forgotten is the work of others that influenced my own work. This may give the impression that I am claiming more credit for ideas than I deserve, for which I apologize.
Where I think it's interesting, I give the story behind the publication or non-publication of a paper. Some of the stories read like complaints of unfair treatment by editors or referees. Such cases are bound to arise in any activity based on human judgment. On the whole, I have had little trouble getting my papers published. In fact, I have profited from the natural tendency of editors and referees to be less critical of the work of established scientists. But I think it's worth mentioning the cases where the system didn't work as it should.
I would like to have ordered my papers by the date they were written. However, I usually have no record of when I actually wrote something. So, I have ordered them by date of publication or, for unpublished works, by the date of the version that I have. Because of long and variable publication delays, this means that the order is only approximately chronological.
Whenever possible, I have included electronic versions of the works. At the moment, I have electronic versions mainly of works written after about 1985. For journal articles, these may be "preprint" versions, formatted differently and sometimes differing slightly from the published versions. I hope in the future to provide scanned versions of earlier publications.
This page can be found by
searching the Web for the 23-letter string
alllamportspubsontheweb.
Please do not put this string in any document that might appear on the
Web-including email messages and Postscript, PDF, and Word
documents. One way to refer to it in Web documents is "the
string obtained by removing the - characters from the string
alllam-portspu-bsonth-eweb."
Anyway, this paper was my first pass at writing up the complete
version of the theory for publication. I strongly suspect that it has
never been read. No-one seems to have noticed that, because of a
text-editing error, the description of the algorithm is missing the
punch line that says what can be executed in parallel. This paper is
superseded by the unpublished [19].
To justify my attendance at Sagamore, I always submitted a paper.
But once I discovered that it was not a first-rate conference, I did not
submit first-rate papers. However, I don't republish old material
(except as necessary to avoid forcing people to read earlier papers),
so this paper must have included some new results about the hyperplane
method for parallelizing sequential loops. However, I can't find a
copy of the paper and don't remember what was in it.
I don't know how many people realize how remarkable this
algorithm is. Perhaps the person who realized it better than anyone
is Anatol Holt, a former colleague at Massachusetts Computer
Associates. When I showed him the algorithm and its proof and pointed
out its amazing property, he was shocked. He refused to believe it
could be true. He could find nothing wrong with my proof, but he was
certain there must be a flaw. He left that night determined to find
it. I don't know when he finally reconciled himself to the
algorithm's correctness.
Several books have included emasculated versions of the algorithm in
which reading and writing are atomic operations, and called those
versions "the bakery algorithm". I find that deplorable. There's
nothing wrong with publishing a simplified version, as long as it's
called a simplified version.
What is significant about the bakery algorithm is that it implements
mutual exclusion without relying on any lower-level mutual exclusion.
Assuming that reads and writes of a memory location are atomic
actions, as previous mutual exclusion algorithms had done, is
tantamount to assuming mutually exclusive access to the location. So
a mutual exclusion algorithm that assumes atomics reads and writes is
assuming lower-level mutual exclusion. Such an algorithm cannot
really be said to solve the mutual exclusion problem. Before the
bakery algorithm, people believed that the mutual exclusion problem
was unsolvable--that you could implement mutual exclusion only by
using lower-level mutual exclusion. Brinch Hansen said exactly this
in a 1972 paper. Many people apparently still believe it. (See
[90].)
The paper itself does not state that it is a "true" mutual exclusion
algorithm. This suggests that I didn't realize the full significance
of the algorithm until later. But I don't remember.
For a couple of years after my discovery of the bakery algorithm,
everything I learned about concurrency came from studying it. Papers
like [24], [32], and [69]
were direct results of that study. The bakery algorithm was also
where I introduced the idea of variables belonging to a process--that
is, variables that could be read by multiple processes, but written by
only a single process. I was aware from the beginning that such
algorithms had simple distributed implementations, where the variable
resides at the owning process, and other processes read it by sending
messages to the owner. Thus, the bakery algorithm marked the
beginning of my study of distributed algorithms.
The paper contains one small but significant error. In a
footnote, it claims that we can consider reads and writes of a single
bit to be atomic. It argues that a read overlapping a write must
get one of the two possible values; if it gets the old value, we can
consider the read to have preceded the write, otherwise to have
followed it. It was only later, with the work eventually described in
[69], that I realized the fallacy in this reasoning.
The note contains the intriguing sentence: "There is a complicated
modified version of the bakery algorithm in which the values of all
variables are bounded." I never wrote down that version, and I'm not
sure what I had in mind. But I think I was thinking of roughly the
following modification. As a process waits to enter its critical
section, it keeps reducing its number, not entering the critical
section until its number equals one. A process p can reduce
its number by at most one, and only when the next lower-numbered
process's number is at least two less than p's number, and the
next higher-numbered process is within one of p's number. I
think I intended to use the techniques of [24] to allow
reading and writing of numbers to remain non-atomic while maintaining
the order of waiting processes. (If eventually all processes stop
changing their numbers, then all processes will eventually read the
correct numbers, allowing some process to progress.) At one time, I
convinced myself that this algorithm is correct. But I never wrote a
rigorous proof, so I don't know if it really works. Filling in the
details and proving correctness should be a nice exercise.
A referee said of the initial submission that the proofs were too
long, tediously proving the obvious. In fact, there was a bug in the
initial version, and at least one of those obvious proofs was of a
false statement. Needless to say, I corrected the algorithm and wrote
more careful proofs.
The basic proof that an arbiter cannot have a bounded
response time uses continuity to demonstrate that, if there are two
inputs that can drive a flip-flop into two different states, then
there must exist an input that makes the flip-flop hang. At the time,
it was very difficult to convince someone that this argument was
valid. They seemed to believe that, because a flip-flop has only
discrete stable states, continuity doesn't apply.
I described the arbiter problem to Palais, who had been my de
jure thesis adviser and afterwards became a colleague and a friend.
He recognized that the correct mathematical way to view what was going
on is in terms of the compact-open topology on the space of flip-flop
behaviors. So, we wrote this paper to explain why the apparently
discontinuous behavior of an arbiter is actually continuous in the
appropriate topology.
This paper was rejected by the IEEE Transactions on Computers
because the engineers who reviewed it couldn't understand the
mathematics. Six years later, the journal apparently acquired more
mathematically sophisticated reviewers, and it published a less
general result with a more complicated proof. I believe someone has
finally published a paper on the subject that does supersede ours.
In the course of my work on parallelizing sequential code (see
[9]), I essentially rediscovered Floyd's method as a way
of extracting properties of a program. When I showed a colleague what
I was doing, he went to our library at Massachusetts Computer
Associates and gave me a copy of the original tech report version of
Floyd's classic paper Assigning Meanings to Programs. I don't
remember when I read Hoare's An Axiomatic Basis for Computer
Programming, but it was probably not long afterwards.
In the mid-70s, several people were thinking about the problem of
verifying concurrent programs. The seminal paper was Ed Ashcroft's
Proving Assertions About Parallel Programs, published in the
Journal of Computer and System Sciences in 1975. That paper
introduced the fundamental idea of invariance. I discovered how to
use the idea of invariance to generalize Floyd's method to
multiprocess programs. As is so often the case, in retrospect the
idea seems completely obvious. However, it took me a while to come to
it. I remember that, at one point, I thought that a proof would
require induction on the number of processes.
This paper introduced the concepts of safety and liveness as the
proper generalizations of partial correctness and termination to
concurrent programs. It also introduced the terms "safety" and
"liveness" for those classes of properties. I stole those terms
from Petri nets, where they have similar but formally very different
meanings. (Safety of a Petri net is a particular safety property;
liveness of a Petri net is not a liveness property.)
At the same time I was devising my method, Susan Owicki was writing
her thesis at Cornell under David Gries and coming up with very much
the same ideas. The generalization of Floyd's method for proving
safety properties of concurrent programs became known as the
Owicki-Gries method. Owicki and Gries did not do anything comparable
to the method for proving liveness in my paper. (Nissim Francez and
Amir Pnueli developed a general proof method that did handle liveness
properties, but it lacked a nice way of proving invariance
properties.) My method had deficiencies that were corrected with the
introduction of temporal logic, discussed in [46].
The Owicki-Gries version of the method for proving safety properties
differed from mine in two ways. The significant way was that I made
the control state explicit, while they had no way to talk about it
directly. Instead, they introduced dummy variables to capture the
control state. The insignificant way was that I used a flowchart
language while they used an Algol-like language.
The insignificant syntactic difference in the methods turned out to
have important ramifications. For writing simple concurrent
algorithms, flowcharts are actually better than conventional toy
programming languages because they make the atomic actions, and hence
the control state, explicit. However, by the mid-70s, flowcharts were
passé and structured programming was all the rage,
so my paper was forgotten and people read only theirs. The paper was
rediscovered about ten years later, and is now generally cited
alongside theirs in the mandatory references to previous work.
More important though is that, because they had used a "structured"
language, Owicki and Gries thought that they had generalized Hoare's
method. To the extent that Floyd's and Hoare's methods are different,
it is because Hoare's method is based on the idea of hierarchical
decomposition of proofs. The Owicki-Gries method doesn't permit this
kind of clean hierarchical decomposition. Gries, commenting in 1999,
said: "We hardly ever looked at Floyd's work and simply did
everything based on Hoare's axiomatic theory." I suspect that,
because they weren't thinking at all about Floyd's approach, they
didn't notice the difference between the two, and thus they didn't
realize that they were generalizing Floyd's method and not Hoare's.
The result of presenting a generalization of Floyd's method in Hoare's
clothing was to confuse everyone. For a period of about ten years,
hardly anyone really understood that the Owicki-Gries method was just
a particular way of structuring the proof of a global invariant. I
can think of no better illustration of this confusion than the
EWD
A Personal Summary of the Owicki-Gries
Theory that Dijkstra wrote and subsequently published in a book of his
favorite EWDs.( Throughout his career, Edsger Dijkstra
wrote a series of notes identified by an "EWD" number.)
If even someone as smart and generally clear-thinking as Dijkstra
could write such a confusing hodge-podge of an explanation, imagine
how befuddled others must have been. A true generalization of Hoare's
method to concurrent programs didn't come until several years later in
[39].
I think it soon became evident that one wanted to talk explicitly
about the control state. Susan Owicki obviously agreed, since we
introduced the at, in, and after predicates for
doing just that in [46]. Quite a bit later, I had more
to say about dummy variables versus control state in [77].
Dummy variables were more than just an ugly hack to avoid control
variables. They also allowed you to capture history. Adding history
variables makes it possible to introduce behavioral reasoning into an
assertional proof. (In the limit, you can add a variable that
captures the entire history and clothe a completely behavioral proof
in an assertional framework.) What a program does next depends on its
current state, not on its history. Therefore, a proof that is based
on a history variable doesn't capture the real reason why a program
works. I've always found that proofs that don't use history variables
teach you more about the algorithm. (As shown in
[91], history variables may be necessary if the
correctness conditions themselves are in terms of history.)
When we developed our methods, Owicki and I and most everyone else
thought that the Owicki-Gries method was a great improvement over
Ashcroft's method because it used the program text to decompose the
proof. I've since come to realize that this was a mistake. It's
better to write a global invariant. Writing the invariant as an
annotation allows you to hide some of the explicit dependence of the
invariant on the control state. However, even if you're writing your
algorithm as a program, more often than not, writing the invariant as
an annotation rather than a single global invariant makes things more
complicated. But even worse, an annotation gets you thinking in terms
of separate assertions rather than in terms of a single global
invariant. And it's the global invariant that's important. Ashcroft
got it right. Owicki and Gries and I just messed things up. It took
me quite a while to figure this out.
Sometime during the '80s, Jay Misra noticed that the definition
of well-foundedness (Definition 8 on page 136) is obviously incorrect.
The origin of this paper was a note titled The Maintenance of
Duplicate Databases by Paul Johnson and Bob Thomas. I believe their
note introduced the idea of using message timestamps in a distributed
algorithm. I happen to have a solid, visceral understanding of
special relativity (see [4]). This
enabled me to grasp immediately the essence of what they were trying
to do. Special relativity teaches us that there is no invariant total
ordering of events in space-time; different observers can disagree
about which of two events happened first. There is only a partial
order in which an event e1 precedes an event
e2 iff e1 can causally
affect e2. I realized that the essence of
Johnson and Thomas's algorithm was the use of timestamps to provide a
total ordering of events that was consistent with the causal order.
This realization may have been brilliant. Having realized it,
everything else was trivial. Because Thomas and Johnson didn't
understand exactly what they were doing, they didn't get the algorithm
quite right; their algorithm permitted anomalous behavior that
essentially violated causality. I quickly wrote a short note pointing
this out and correcting the algorithm.
It didn't take me long to realize that an algorithm for
totally ordering events could be used to implement any distributed
system. A distributed system can be described as a particular
sequential state machine that is implemented with a network of
processors. The ability to totally order the input requests leads
immediately to an algorithm to implement an arbitrary state machine by
a network of processors, and hence to implement any distributed
system. So, I wrote this paper, which is about how to implement an
arbitrary distributed state machine. As an illustration, I used the
simplest example of a distributed system I could think of--a
distributed mutual exclusion algorithm.
This is my most often cited paper. Many computer scientists claim to
have read it. But I have rarely encountered anyone who was
aware that the paper said anything about state machines. People seem to
think that it is about either the causality relation on events in a
distributed system, or the distributed mutual exclusion problem.
People have insisted that there is nothing about state machines in the
paper. I've even had to go back and reread it to convince myself that
I really did remember what I had written.
The paper describes the synchronization of logical clocks. As
something of an afterthought, I decided to see what kind of
synchronization it provided for real-time clocks. So, I included a
theorem about real-time synchronization. I was rather surprised by
how difficult the proof turned out to be. This was an indication of
what lay ahead in [61].
This paper won the 2000 PODC Influential Paper Award (later
renamed the Edsger W. Dijkstra Prize in Distributed Computing).
It won an ACM SIGOPS Hall of Fame Award in 2007.
TECO stands for Text Editing and Correction. It was the command
language for the TECO editor, and it was the underlying macro language
on which the original version of Emacs was built. It was an obscure,
low-level language whose goal was to perform powerful text editing
operations with the minimum number of keystrokes. A programming
language designed to make verification easy would be completely unlike
TECO. The paper shows that you verify a TECO program the
same way you verify a program written in a more conventional language.
The proof is perhaps also of some historical interest because it was an
early example of a proof of an interactive program--that is, one that
interacts with the user instead of just producing an answer. Thus,
correctness had to be asserted in terms of the sequence of input
actions. The paper generalizes the Floyd/Hoare method to deal with
the history of environment actions.
To my knowledge, this is the first published paper to discuss
arbitrary failures (later called Byzantine failures). It actually
considered malicious behavior, not using such behavior simply as a
metaphor for completely unpredictable failures. Its algorithm was the
inspiration for the digital signature algorithm of [40].
With its use of real-time, this paper presaged the ideas in
[54].
This paper gives an overview of the complete SIFT project,
which included designing the hardware and software and formally
verifying the system's correctness. It announces the results that
appear in [40]. It also is a very early example of the
basic specification and verification method I still advocate: writing
a specification as a state-transition system and showing that each
step of the lower-level specification either implements a step of the
higher-level one or is a "stuttering" step that leaves the
higher-level state unchanged. The paper doesn't mention the use of an
invariant, but that was probably omitted to save space.
This paper was a group effort that I choreographed in a final frenzy
of activity to get the paper out in time for the issue's deadline. I
don't remember who wrote what, but the section on verification seems
to be my writing.
I received an early version of the paper from Dijkstra, and I made a
couple of suggestions. Dijkstra incorporated them and, quite
generously, added me to the list of authors. He submitted the paper
to CACM. The next I heard about it was when I received a
copy of a letter from Dijkstra to the editor withdrawing the paper.
The letter said that someone had found an error in the algorithm, but
gave no indication of what the error was. Since Dijkstra's proof was
so convincing, I figured that it must be a trivial error that could
easily be corrected.
I had fairly recently written [22]. So, I decided to
write a proof using that proof method, thinking that I would then find
and correct the error. In about 15 minutes, trying to write the proof
led me to the error. To my surprise, it was a serious error.
I had a hunch that the algorithm could be fixed by changing the order
in which two operations were performed. But I had no good reason to
believe that would work. Indeed, I could see no simple informal
argument to show that it worked. However, I decided to go ahead and
try to write a formal correctness proof anyway. It took me about two
days of solid work, but I constructed the proof. When I was through,
I was convinced that the algorithm was now correct, but I had no
intuitive understanding of why it worked.
In the meantime, Dijkstra figured out that the algorithm
could be fixed by interchanging two other operations, and he wrote the
same kind of behavioral proof as before. His fix produced an arguably
more efficient algorithm than mine, so that's the version we used. I
sketched an assertional proof of that algorithm. Given the evidence
of the unreliability of his style of proof, I tried to get Dijkstra to
agree to a rigorous assertional proof. He was unwilling to do that,
though he did agree to make his proof somewhat more assertional and
less operational. Here are his comments on that, written in July,
2000:
There were, of course, two issues at hand: (A) a witness showing that
the problem of on-the-fly garbage collection with fine-grained
interleaving could be solved, and (B) how to reason effectively about
such artifacts. I am also certain that at the time all of us were
aware of the distinction between the two issues. I remember very well
my excitement when we convinced ourselves that it could be done at
all; emotionally it was very similar to my first solutions to the
problem of self-stabilization. Those I published without proofs! It
was probably a period in my life that issue (A) in general was still
very much in the foreground of my mind: showing solutions to problems
whose solvability was not obvious at all. It was more or less my
style. I had done it (CACM, Sep. 1965) with the mutual
exclusion.
David Gries later published an Owicki-Gries style proof of the
algorithm that was essentially the same as the one I had sketched. He
simplified things a bit by combining two atomic operations into one.
He mentioned that in a footnote, but CACM failed to print the
footnote. (However, they did print the footnote number in the text.)
The lesson I learned from this is that behavioral proofs are
unreliable and one should always use state-based reasoning for
concurrent algorithms--that is, reasoning based on invariance.
This paper introduced the method of reasoning with the two arrow
relations and applied it to a variant of the bakery algorithm. In the
spring of 1976 I spent a month working with Carel Scholten at the
Philips Nat Lab in Eindhoven. Scholten was a colleague and
friend of Dijkstra, and the three of us spent one afternoon a week
working, talking, and drinking beer at Dijkstra's house. The
algorithm emerged from one of those afternoons. I think I was its
primary author, but as I mention in the paper, the beer and the
passage of time made it impossible for me to be sure of who was
responsible for what.
The solid and dashed arrow formalism provides very elegant proofs for
tiny algorithms such as the bakery algorithm. But, like all
behavioral reasoning, it is hard to make completely formal, and it
collapses under the weight of a complex problem. You should use
assertional methods to reason about complex algorithms. However,
standard assertional reasoning requires that the algorithm be written
in terms of its atomic operations. The only assertional approach to
reasoning directly about nonatomic operations (without translating
them into sequences of atomic operations) is the one in
[85], which is not easy to use. The two-arrow
formalism is still good for a small class of problems.
The formalism seems to have been almost completely ignored, even among
the theoretical concurrency community. I find this ironic. There is
a field of research known by the pretentious name of "true
concurrency". Its devotees eschew assertional methods that are based
on interleaving models because such models are not truly concurrent.
Instead, they favor formalisms based on modeling a system as a partial
ordering of events, which they feel is the only truly concurrent kind
of model. Yet, those formalisms assume that events are atomic and
indivisible. Atomic events don't overlap in time the way real
concurrent operations do. The two-arrow formalism is the only one I
know that is based on nonatomic operations and could therefore be
considered truly concurrent. But, as far as I know, the true
concurrency community has paid no attention to it.
Whitfield Diffie and Martin E. Hellman.
New Directions in Cryptography.
IEEE Transactions on Information Theory IT-22, 6
(1976), 644-654.
In 1978, Michael Rabin published a paper titled Digitalized
Signatures containing a more practical scheme for generating digital
signatures of documents. (I don't remember what other digital
signature algorithms had already been proposed.) However, his
solution had some drawbacks that limited its utility. This report
describes an improvement to Rabin's algorithm that eliminates those
drawbacks.
I'm not sure why I never published this report. However, I think it
was because, after writing it, I realized that the algorithm could be
fairly easily derived directly from Rabin's algorithm. So, I didn't
feel that it added much to what Rabin had done. However, I've been
told that this paper is cited in the cryptography literature and is
considered significant, so perhaps I was wrong.
I realized that there are two types of temporal logic: the one Pnueli
used I called linear time logic; the one most computer
scientists seemed to find natural I called branching time.
(These terms were used by temporal logicians, but they distinguished
the two logics by the axioms they satisfied, while I described them in
terms of different kinds of semantics.) Pnueli chose the right kind
of logic--that is, the one that is most useful for expressing
properties of concurrent systems. I wrote this paper to explain the
two kinds of logic, and to advocate the use of linear-time logic.
However, I figured that the paper wouldn't be publishable without some
real theorems. So, I proved some simple results demonstrating that
the two kinds of logic really are different.
I submitted the paper to the Evian Conference, a conference on
concurrency held in France to which it seems that everyone working in
the field went. I was told that my paper was rejected because they
accepted a paper by Pnueli on temporal logic, and they didn't feel
that such an obscure subject merited two papers. I then submitted the
paper to FOCS, where it was also rejected. I have very rarely
resubmitted a paper that has been rejected. Fortunately, I felt that
this paper should be published. It has become one of the most
frequently cited papers in the temporal-logic literature.
When I wrote this paper, I sent a copy to Tony Hoare thinking that he
would like it. He answered with a letter that said, approximately:
"I always thought that the generalization to concurrent programs
would have to look something like that; that's why I never did it."
(Unfortunately, I no longer have the letter.) At the time, I
dismissed his remark as the ramblings of an old fogey. I now think he
was right--though probably for different reasons than he does. As I
indicated in the discussion of [22], I think Ashcroft was
right; one should simply reason about a single global invariant, and
not do this kind of decomposition based on program structure.
I am often unfairly credited with inventing the Byzantine
agreement problem. The problem was formulated by people working on
SIFT (see [29]) before I arrived at SRI. I had
already discovered the problem of Byzantine faults and written
[28]. (I don't know if that was earlier than or
concurrent with its discovery at SRI.) However, the people
at SRI had a much simpler and more elegant statement of the problem
than was present in [28].
The 4-processor solution presented in this paper and the general
impossibility result were obtained by Shostak; Pease invented the
3
My other contribution to this paper was getting it written. Writing
is hard work, and without the threat of perishing, researchers outside
academia generally do less publishing than their colleagues at
universities. I wrote an initial draft, which displeased Shostak so
much that he completely rewrote it to produce the final version.
Over the years, I often wondered whether the people who actually build
airplanes know about the problem of Byzantine failures. In 1997, I
received email from John Morgan who used to work at Boeing. He told
me that he came across our work in 1986 and that, as a result, the
people who build the passenger planes at Boeing are aware of the
problem and design their systems accordingly. But, in the late 80s
and early 90s, the people at Boeing working on military aircraft and
on the space station, and the people at McDonnell-Douglas, did not
understand the problem. I have no idea what Airbus knows or when
they knew it.
This paper was awarded the 2005 Edsger W. Dijkstra Prize in
Distributed Computing.
On the trip back home to California, I got on an airplane at Laguardia
Airport in the morning with a snowstorm coming in. I got off the
airplane about eight hours later, still at Laguardia Airport, having
written this paper. I then sent it to Fischer for his comments. I
waited about a year and a half. By the time he finally decided that
he wasn't going to do any more work on the paper, subsequent work by
others had been published that superseded it.
There is a problem in distributed computing that is
sometimes called the Chinese Generals Problem, in which two generals
have to come to a common agreement on whether to attack or retreat,
but can communicate only by sending messengers who might never arrive.
I stole the idea of the generals and posed the problem in terms of a
group of generals, some of whom may be traitors, who have to reach a
common decision. I wanted to assign the generals a nationality that
would not offend any readers. At the time, Albania was a completely
closed society, and I felt it unlikely that there would be any
Albanians around to object, so the original title of this paper was
The Albanian Generals Problem. Jack Goldberg was smart enough
to realize that there were Albanians in the world outside Albania, and
Albania might not always be a black hole, so he suggested that I find
another name. The obviously more appropriate Byzantine generals then
occurred to me.
The main reason for writing this paper was to assign the new name to
the problem. But a new paper needed new results as well. I came up
with a simpler way to describe the general
3
In [22], I had introduced a method of proving eventuality
properties of concurrent programs by drawing a lattice of predicates,
where arrows from a predicate P to predicates Q1, ..., Qn mean that, if the
program reaches a state satisfying P, it must thereafter reach
a state satisfying one of the Qi. That method
never seemed practical; formalizing an informal proof was too much
work.
Pnueli's introduction of temporal logic allowed the predicates in the
lattice to be replaced by arbitrary temporal formulas. This turned
lattice proofs into a useful way of proving liveness properties. It
permitted a straightforward formalization of a particularly style of
writing the proofs. I still use this proof style to prove leads-to
properties, though the proofs are formalized with TLA (see
[101]). However, I no longer bother drawing
pictures of the lattices. This paper also introduced at,
in, and after predicates for describing program control.
It's customary to list authors alphabetically, unless one contributed
significantly more than the other, but at the time, I was unaware of
this custom. Here is Owicki's account of how the ordering of the
authors was determined.
As I recall it, you raised the
question of order, and I proposed alphabetical order. You
declined--I think you expected the paper to be important and didn't
think it would be fair to get first authorship on the basis of a
static property of our names. On the night we finished the paper, we
went out to dinner to celebrate, and you proposed that if the last
digit of the bill was even (or maybe odd), my name would be first.
And, indeed, that's the way it came out.
I have in my files a letter from David Harel, who was then an editor
of Information and Control, telling me that the paper was
accepted by the journal, after revision to satisfy some concerns of
the referees. I don't remember why I didn't submit a revised version.
I don't think I found the referees' requests unreasonable. It's
unlikely that I abandoned the paper because I had already developed
the method in [85], since that didn't appear as a SRC
research report until four years later. Perhaps I was just too busy.
At SRI, we were working on writing temporal logic specifications. One
could describe properties using temporal logic, so it seemed very
natural to specify a system by simply listing all the properties it
must satisfy. Richard Schwartz, Michael Melliar-Smith, and I
collaborated on a paper titled Temporal Logic Specification of
Distributed Systems, which was published in the Proceedings of the
2nd International Conference on Distributed Computing Systems, held in
Paris in 1981. However, I became disillusioned with temporal logic
when I saw how Schwartz, Melliar-Smith, and Fritz Vogt were
spending days trying to specify a simple FIFO queue--arguing over
whether the properties they listed were sufficient. I realized that,
despite its aesthetic appeal, writing a specification as a conjunction
of temporal properties just didn't work in practice. So, I had my
name removed from the paper before it was published, and I set about
figuring out a practical way to write specifications. I came up with
the approach described in this paper, which I later called the
transition axiom method.
Schwartz stopped working on specification and verification
in the mid-80s. He wrote recently (in June 2000):
[T]he same frustration with the use of temporal logic led Michael,
Fritz Vogt and me to come up with Interval Logic as a higher level
model in which to express time-ordered properties of events.
[See [43].] As you
recall, interval logic expressed constraints forward and backward
around significant events in order to more closely capture the way that
people describe event-driven behavior. Ultimately, I remain
unsatisfied with any of our attempts, from the standpoint of reaching
practical levels.
This paper is the first place I used the idea of describing a state
transition as a boolean-valued function of primed and unprimed
variables. However, by the early 80s, the idea must have been
sufficiently obvious that I didn't claim any novelty for it, and I
forgot that I had even used it in this paper until years later (see
the discussion of [101]).
Schneider agreed with me that invariance was the central
concept in reasoning about concurrent programs. He was also an expert
on all the different flavors of message passing that had been
proposed. We demonstrated in this paper that the basic
approach of [39] worked just was well with CSP; and we
claimed (without proof) that it also worked in other "distributed"
languages. I found it particularly funny that we should be the ones
to give a Hoare logic to CSP, while Hoare was using essentially
behavioral methods to reason about CSP programs. I'm still waiting
for the laughter.
The paper contains one figure--copied directly from a
transparency--with an obviously bogus algorithm. I tried to recreate
an algorithm from memory and wrote complete nonsense. It's easy to
make such a mistake when drawing a transparency, and I probably didn't
bother to look at it when I prepared the paper. To my knowledge, it
is the only incorrect algorithm I have published.
Contents
The Papers
Massachusetts
Institute of Technology, Project MAC Memorandum MAC-M-332, Artificial
Intelligence Project Memo Number Vision 111 (October 1966).
PDF
In the summer of 1966, I worked at the M.I.T. Artificial Intelligence
Laboratory, doing Lisp programming for a computer vision project. I
have no memory of this document, but it appears to describe the
programs I wrote that summer. It's of no technical interest, but it
does show that, even in those days, I was writing precise
documentation.
Mitre Technical Report
(December 1966).
No electronic version available.
While in graduate school, I worked summers and part-time at the Mitre
Corporation from 1962 to 1965. I think I wrote three or four
technical reports there, but this is the only one the Mitre library
seems to have. A large part of my time at Mitre was spent working on
the operating system for a computer being built there called Phoenix.
This is the operating system's manual, apparently written by Silver
based on work we had both done. There is nothing of technical
interest here, but it provides a snapshot of what was going on in the
world of computers in the early 60s.
Unpublished (circa 1967).
No electronic version available.
During the 1965-1969 academic years, I taught math at Marlboro
College. I don't remember exactly when or how the project got
started, but I wrote the first draft of an advanced calculus textbook
for Prentice-Hall, from whom I received an advance of 500
dollars. (That sum, which seems ridiculously small now, was a
significant fraction of my salary at the time.) The Prentice-Hall
reviewers liked the draft. I remember one reviewer commenting that
the chapter on exterior algebra gave him, for the first time, an
intuitive understanding of the topic. However, because of a letter
that was apparently lost in the mail, the Prentice-Hall editor and I
both thought that the other had lost interest in the project. By the
time this misunderstanding had been cleared up, I was ready to move on
to other things and didn't feel like writing a final draft. (This was
before the days of computer text processing, so writing a new draft
meant completely retyping hundreds of pages of manuscript.)
Unpublished
(circa 1968).
No electronic version available.
Marlboro College, where I taught math from 1965-1969, had a weekly
series of lectures for the general public, each given by a faculty
member or an outside speaker invited by a faculty member. I gave a
lecture about relativity that I later turned into this short
monograph. I made a half-hearted, unsuccessful effort to get it
published. But it was too short (75 pages) to be a "real" book, and
there was very little interest in science among the general public in
the late sixties. I think this monograph is still a very good
exposition of the subject. Unfortunately, the second half, on general
relativity, is obsolete because it says nothing about black holes.
While black holes appear in the earliest mathematical solutions to the
equations of general relativity, it was only in the late 60s that many
physicists began seriously to consider that they might exist and to
study their properties.
Communications of the ACM 13, 9 (September 1970).
PDF
Copyright © 1970 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
This short note describes a minor inefficiency I noticed in a
hash-table algorithm published by James Bell. It got me thinking
about hash tables, and I invented what I called the linear quotient
algorithm--an algorithm that seems quite obvious in retrospect.
While I was running simulations to gather data for a paper on that
algorithm, the latest issue of CACM arrived with a paper by
Bell and Charles Kaman titled The Linear Quotient Hash Code. I
had devised three variants of the algorithm not contained in their
article. So, I wrote a paper about those three variants and submitted
it to CACM. The editor rejected it, without sending
it out for review, saying that it was too small a contribution to
merit publication. In the next few years, CACM published two
papers by others on the subject, each completely devoted to one of my
three variants. (Those papers had been submitted to different
editors.) My paper, which was probably a Massachusetts Computer
Associates (Compass) technical report, has been lost. (Compass went
out of business a few years ago, and I presume that its library was
destroyed.) The linear quotient method is probably the most common
hash-coding algorithm used today.
Ph.D. Thesis, Brandeis University (1972).
Available from UMI
here, as number 7232105.
No electronic version available.
I left Marlboro College and went back to Brandeis in 1969 to complete
my Ph.D. At that time, I intended to study and write a
thesis in mathematical physics. However, I wound up doing a thesis in
pure mathematics, on analytic partial differential equations. I
learned nothing about analytic partial differential equations except
what was needed for my thesis research, and I have never looked at
them since then. The thesis itself was a small, solid piece of very
classical math. Had Cauchy arisen from the grave to read it, he would
have found nothing unfamiliar in the mathematics.
Bulletin of the Amer. Math. Society 79, 4 (July 1973), 776-780.
No electronic version available.
At the time, and perhaps still today, a math student "copyrighted"
his (seldom her) thesis results by announcing them in a short note in
the Bulletin of the AMS. Normally, a complete paper
would be published later. But I never did that, since I left math for
computer science after completing my thesis.
Proceedings of the 1973 Sagamore
Conference on Parallel Processing, T. Feng, ed., 1-12.
No electronic version available.
Compass (Massachusetts Computer Associates) had a contract to write
the Fortran compiler for the Illiac-IV computer, an array computer
with 64 processors that all operated in lock-step on a single
instruction stream. I developed the theory and associated algorithms
for executing sequential DO loops in parallel on an array computer
that were used by the compiler. The theory is pretty straightforward.
The creativity lay in the proper mathematical formulation of the
problem. Today, it would be considered a pretty elementary piece of
work. But in those days, we were not as adept at applying math to
programming problems. Indeed, when I wrote up a complete description
of my work for my colleagues at Compass, they seemed to treat it as a
sacred text, requiring spiritual enlightenment to interpret the occult
mysteries of linear algebra.
Communications of the ACM 17, 2
(February 1974), 83-93.
PDF
Copyright © 1974 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
This is the only journal paper to come out of the work mentioned in
the description of [8]. It contains essentially the
special case of the results in [8] for a single tight
nesting of loops. It was one of the early articles on the topic, and
I believe it was cited fairly often.
Proceedings of the 1974 Sagamore Conference on
Parallel Processing, T. Feng, ed., Springer Verlag, 1-12.
No electronic version available.
In the late 19th century, the Vanderbilts (a rich American family)
owned 500 hectares of beautiful land in the central Adirondack
Mountains of New York State that included Sagamore Lake, which is
about a kilometer in length. There, they built a rustic summer
estate. In the 70s, and probably still today, the place was
completely isolated, away from any other signs of civilization. The
estate, with land and lake, was given to Syracuse University,
which operated it as a conference center. An annual conference on
parallel processing was held there late in the summers of 1973 through
1975. Sagamore was the most beautiful conference site I have ever
seen. The conference wasn't bad, with a few good people attending,
though it wasn't first rate. But I would have endured a conference on
medieval theology for the opportunity to canoe on, swim in, and walk
around the lake.
Communications of the ACM 17, 8 (August 1974), 453-455.
PDF
Copyright © 1974 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
This paper describes the bakery algorithm for implementing mutual
exclusion. I have invented many concurrent algorithms. I feel that I
did not invent the bakery algorithm, I discovered it. Like all
shared-memory synchronization algorithms, the bakery algorithm
requires that one process be able to read a word of memory while
another process is writing it. (Each memory location is written by
only one process, so concurrent writing never occurs.) Unlike any
previous algorithm, and almost all subsequent algorithms, the bakery
algorithm works regardless of what value is obtained by a read that
overlaps a write. If the write changes the value from 0 to 1, a
concurrent read could obtain the value 7456 (assuming that 7456 is a
value that could be in the memory location). The algorithm still
works. I didn't try to devise an algorithm with this property. I
discovered that the bakery algorithm had this property after writing a
proof of its correctness and noticing that the proof did not depend on
what value is returned by a read that overlaps a write.
Massachusetts Computer Associates Technical Report CA
7412-0511 (5 December 1974).
No electronic version available.
This note was written upon reading Dijkstra's classic paper
"Self-stabilizing Systems in Spite of Distributed Control" that
appeared in the November 1974 issue of CACM (see
[57]). It generalizes one of the algorithms
in Dijkstra's paper from a line of processes to an arbitrary tree of
processes. It also discusses the self-stabilizing properties of the
bakery algorithm. I never tried to publish this note--probably
because I regarded it as too small a piece of work to be worth a paper
by itself.
Proceedings of a Conference on Programming Languages
and Compilers for Parallel and Vector Machines, published as
ACM SIGPLAN Notices 10, 3 (March 1975), 25-33.
No electronic version available.
This is a position paper advocating the use of a higher-level language
that expresses what must be computed rather than how it is to be
computed. It argues that compilers are better than humans at
generating efficient code for parallel machines. I was so naive then!
I soon learned how bad compilers really were, and how trying to make
them smarter just made them buggier. But compiler writers have gotten
a lot better, so maybe this paper isn't as stupid now as it was then.
Proceedings of the 1975 Sagamore Conference on
Parallel Processing, T. Feng, ed., 187-191.
No electronic version available.
This paper considers the problem of efficiently executing a sequence
of explicitly parallel statements--ones requiring simultaneous
execution for all values of a parameter--when there are more
parameter values than there are processors. It is one of the lesser
papers that I saved for the Sagamore Conference. (See the discussion
of [10].)
Communications of the ACM 18, 8 (August 1975), 471-475.
PDF
Copyright © 1975 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
My algorithms for parallelizing loops, described in papers starting
with [8], were rather inefficient. They could be sped
up with parallel execution on an array processor like the
Illiac-IV. But I realized one could do even better than the
64-times speedup provided by the Illiac's 64 processors. Each datum
being manipulated was just a few bits, so I had the idea of packing
several of the data into a single word and manipulating them
simultaneously. Not only could this speed computation on the Illiac,
but it allowed one to do array processing on an ordinary uniprocessor.
This paper describes general techniques for doing such parallel
computation on packed data. It's a neat hack, and it's more useful
now than it was then for two reasons. The obvious reason is that word
size is larger now, with many computers having 64-bit words. The less
obvious reason is that conditional operations are implemented with
masking rather than branching. Instead of branching around the
operation when the condition is not met, masks are constructed so the
operation is performed only on those data items for which the
condition is true. Branching is more costly on modern multi-issue
computers than it was on the computers of the 70s.
Acta Informatica 7, 1 (1976), 15-34.
No electronic version available.
There are a class of synchronization problems that are direct
generalizations of mutual exclusion in that they assert constraints on
when a process is allowed to perform a task. They include the dining
philosophers problem and the readers/writers problem. This paper
shows how a modified version of the bakery algorithm can be used to
solve any such problem.
Information Processing Letters 4, 4 (January 1976),
88-89.
No electronic version available.
This is a comment on a short note by Richard Lipton and Robert Tuttle
claiming to find an inadequacy in Dijkstra's P and V synchronization
primitives. It points out that they had introduced a red herring
because the problem that those primitives couldn't solve could not be
stated in terms of entities observable within the system. As my note
states: "A system cannot be correct unless its correctness depends
only upon events and conditions observable within the system." That's
something worth remembering, since I've encountered that same sort of
red herring on other occasions. My observation is relevant to
[62], but I had forgotten all about this note by the time
I wrote [62].
Proceedings of the 1976 International
Conference on Parallel Processing, T. Feng, ed., 50-54.
No electronic version available.
This is a minor extension to the concurrent garbage collection
algorithm of [30]. That algorithm uses a single garbage
collector process running in parallel with the "mutator" process
that creates the garbage. This paper describes how to use multiple
processes to do the collection, and how to handle multiple mutator
processes. It is a minor work that I wrote up as an excuse for
going to the Sagamore conference. (See the discussion of
[10].) However, the conference had been renamed and
moved from Sagamore to a less attractive, mosquito-infected site in the
Michigan woods. It was the last time I attended.
Unpublished (August 1976).
No electronic version available.
This is a totally revised version of [8], complete
with proofs (which had been omitted from [8]). I
submitted it to CACM, but the editor of CACM decided
that it was more appropriate for JACM. When I
submitted it there, the editor of JACM rejected it without
sending it out for review because the basic ideas had already appeared
in the conference version [8]. I was young and
inexperienced, so I just accepted the editor's decision. I began to
suspect that something was amiss a year or two later, when a paper
from the same conference was republished verbatim in
CACM. By the time I realized how crazy the editor's
decision had been, it didn't seem worth the effort of resubmitting the
paper.
Rejected by the 1977 IFIP Congress
(October 1976).
No electronic version available.
This paper was a rough draft of some ideas, not all of which were
correct. I don't remember how it became known, but I received
requests for copies for years afterwards.
Rejected
by IEEE Transactions on Computers (November 1976).
Postscript -
Compressed Postscript -
PDF
When I wrote [11], a colleague at Massachusetts Computer
Associates pointed out that the concurrent reading and writing of a
single register, assumed in the bakery algorithm, requires an
arbiter--a device for making a binary decision based on inputs that
may be changing. In the early 70s, computer designers rediscovered
that it's impossible to build an arbiter that is guaranteed to reach a
decision in a bounded length of time. (This had been realized in the
50s but had been forgotten.) My colleague's observation led to my
interest in the arbiter problem--or "glitch" problem, as it was
sometimes called.
IEEE Transactions on Software Engineering SE-3, 2
(March 1977), 125-143.
No electronic version available.
When I first learned about the mutual exclusion problem, it seemed
easy and the published algorithms seemed needlessly complicated. So,
I dashed off a simple algorithm and submitted it to
CACM. I soon received a referee's report pointing
out the error. This had two effects. First, it made me mad enough at
myself to sit down and come up with a real solution. The result was
the bakery algorithm described in [11]. The second effect
was to arouse my interest in verifying concurrent algorithms. This
has been a very practical interest. I want to verify the algorithms
that I write. A method that I don't think is practical for my
everyday use doesn't interest me.
Programmation-2me Colloque
International, B. Robinet, ed., Dunod, Paris (1977), 1-8.
No electronic version available.
This is an abbreviated, conference version of [22]. (In
those days, publication was fast enough that the journal version could
appear before the conference version.) I wrote this paper as an
excuse for attending a conference in Paris. However, it turned out
that I went to Europe for other reasons that made it impossible for me
to attend the conference. I tried to withdraw the paper, but it
was too late.
Communications of the ACM 20, 11
(November 1977), 806-811.
PDF
Copyright © 1977 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
This paper came out of my study of the bakery algorithm of
[11]. The problem with that algorithm is that it requires
unbounded state. To allow the state to be bounded in practice, I
needed an algorithm for reading and writing multidigit numbers, one
digit at a time, so that a read does not obtain too large a value if
it overlaps a write. This paper shows that this can be done by simply
writing the digits in one direction and reading them in the other. It
also has some other nice algorithms.
The paper assumes that reading and writing a single digit are atomic
operations. The original version introduced the notion of a regular
register and proved the results under the weaker assumption that the
individual digits were regular. However, the editor found the idea of
nonatomic reads and writes to individual digits too heretical for
CACM readers, and he insisted that I make the stronger
assumption of atomicity. So, the world had to wait another decade,
until the publication of [69], to learn about regular
registers.
ACM SIGSOFT Software Engineering Notes
3, 1 (January 1978) 26.
PDF
Copyright © 1978 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
The title says it all. This one-page note is as relevant today as
when I wrote it. Replace "describing the solution" by "writing the
program" and it becomes a practical recipe for improving software.
Communications of the ACM 21, 7 (July 1978), 558-565. Reprinted in
several collections, including Distributed Computing:
Concepts and Implementations, McEntire et al., ed. IEEE Press, 1984.
PDF
Copyright © 1978 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
Jim Gray once told me that he had heard two different opinions of this
paper: that it's trivial and that it's brilliant. I can't argue with
the former, and I am disinclined to argue with the latter.
Proceedings of the International
Conference on Mathematical Studies of Information Processing Kyoto,
Japan (August, 1978), 477-540.
No electronic version available.
In the late 70s, people were talking about designing programming
languages that would make program verification easier. I didn't think
much of that idea. I felt that the difficulty in verification comes
from the algorithm, not the details of the programming language in
which it is described. To demonstrate this view, I published in this
paper a proof of correctness of a TECO program.
Computer Networks 2 (1978), 95-114.
PDF
All copyrights reserved by Elsevier Science 1978.
In [26], I introduced the idea of implementing any
distributed system by using an algorithm to implement an arbitrary
state machine in a distributed system. However, the algorithm in
[26] assumed that processors never fail and all
messages are delivered. This paper gives a fault-tolerant algorithm.
It's a real-time algorithm, assuming upper bounds on message delays in
the absence of faults, and that nonfaulty processes had clocks
synchronized to within a known bound.
Proceedings of the IEEE
66, 10 (October 1978), 1240-1255.
PDF
When it became clear that computers were going to be flying commercial
aircraft, NASA began funding research to figure out how to make them
reliable enough for the task. Part of that effort was the SIFT
project at SRI. This project was perhaps most
notable for producing the Byzantine generals problem and its
solutions, first reported in [40].
Communications of the ACM 21, 11
(November 1978), 966-975.
PDF
Copyright © 1978 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
This paper presents the first concurrent garbage collection
algorithm--that is, an algorithm in which the collector operates
concurrently with the process that creates the garbage. The paper is
fairly well known; its history is not.
I, too, have always been most interested in showing that something
could be done, rather than in finding a better algorithm for doing
what was known to be possible. Perhaps that is why I have always been
so impressed by the brilliance of Dijkstra's work on concurrent
algorithms.
ACM SIGPLAN Notices 14, 3 (March 1979)
38-42.
PDF
Copyright © 1979 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
Fortunately, this note has been completely forgotten. It was written
when people were still proposing new programming-language constructs.
The one presented here may have little to recommend it, but it was no
worse than many others.
ACM Transactions on Programming Languages and Systems 1, 1 (July 1979), 84-97.
PDF
Copyright © 1979 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
Like everyone else at the time, when I began studying concurrent
algorithms, I reasoned about them behaviorally. Such reasoning
typically involved arguments based on the order in which events occur.
I discovered that proofs can be made simpler, more elegant, and more
mathematical by reasoning about operations (which can be composed of
multiple events) and two relations on them: precedes (denoted
by a solid arrow)
and can affect (denoted by a dashed arrow).
Operation A precedes operation B if all the events of
A precede all the events of B; and A can affect
B if some event in A precedes some event in B.
These relations obey some simple rules that can reduce behavioral
reasoning to mathematical symbol pushing.
Unpublished note, 4 August 1979.
Text File
This three-page note is about presenting a paper at a conference, but
it offers good advice for any talk. Except for a couple of
suggestions about hand-written slides, it still applies today.
IEEE Transactions on Computers
C-28, 9 (September 1979) 690-691.
PDF
I forget what prompted me to be thinking about memory caching, but it
occurred to me one day that multiprocessor synchronization algorithms
assume that each processor accesses the same word in memory, but each
processor actually accesses its own copy in its cache. It hardly
required a triple-digit IQ to realize that this could cause problems.
I suppose what made this paper worth reading was its simple, precise
definition of sequential consistency as the required correctness
condition. This was not the first paper about cache coherence.
However, it is the early paper most often cited in the cache-coherence
literature.
SRI International Technical Report CSL-98 (October 1979).
No electronic version available.
At a coffee house in Berkeley around 1975, Whitfield Diffie described
a problem to me that he had been trying to solve: constructing a
digital signature for a document. I immediately proposed a solution.
Though not very practical--it required perhaps 64 bits of published
key to sign a single bit--it was the first digital signature
algorithm. Diffie and Hellman mention it in
their classic paper:
(I think it's at the bottom right of page 650.)
Communications of the ACM 22, 10 (October 1979), 554-556.
PDF
Copyright © 1979 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
In the May, 1978 CACM, Matthew Geller published a paper titled
Test Data as an Aid in Proving Program Correctness. He argued
that there were some programs whose correctness is so hard to state
formally that formally verifying them is useless because the
specification is likely to be wrong. He gave as an example a program
to compute the number of days between two dates in the same year,
claiming that it would be hard to check the correctness of a precise
statement of what it meant for the program to be correct. This paper
proved him wrong. (It also makes the amusing observation that
Geller's solution is wrong because it fails for dates before the
advent of the Gregorian calendar.) As a bonus, readers of this paper
were alerted well in advance that the year 2000 is a leap year.
Communications of the ACM 22, 11
(November 1979), 624.
PDF
Copyright © 1979 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
In the May, 1979 CACM, De Millo, Lipton, and Perlis published
an influential paper titled Social Process and Proofs of
Theorems and Programs. This paper made some excellent observations.
However, by throwing in a few red herrings, they came to some wrong
conclusions about program verification. More insidiously, they framed
the debate as one between a reasonable engineering approach that
completely ignores verification and a completely unrealistic view of
verification advocated only by its most naive proponents. (There
were, unfortunately, quite a few such proponents.)
At that time, some ACM publications had a special section on
algorithms. In an ironic coincidence, the same issue of CACM
carried the official ACM policy on algorithm submissions. It included
all sorts of requirements on the form of the code, and even on the
comments. Missing was any requirement that the correctness of the
algorithm be demonstrated in any way.
I was appalled at the idea that, ten years after Floyd and Hoare's
work on verification, the ACM was willing to publish algorithms with
no correctness argument. The purpose of my letter was to express my
dismay. I ironically suggested that they had succumbed to the
arguments of De Millo, Lipton, and Perlis in their policy. As a
result, my letter was published as a rebuttal to the De Millo, Lipton,
and Perlis paper. No-one seems to have taken it for what it was--a
plea to alter the ACM algorithms policy to require that there be some
argument to indicate that an algorithm worked.
Proceedings of the Seventh ACM Symposium on Principles
of Programming Languages, ACM SIGACT-SIGPLAN (January 1980).
PDF
Copyright © 1980 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
After graduating from Cornell, Susan Owicki joined the faculty of
Stanford. Some time around 1978, she organized a seminar to study the
temporal logic that Amir Pnueli had recently introduced to computer
science. I was sure that temporal logic was some kind of abstract
nonsense that would never have any practical application, but it
seemed like fun, so I attended. I observed that people got very
confused because, in Pnueli's logic, the concepts of always and
eventually mean what they do to ordinary people. In
particular, something is not always true if and only if it is
eventually false. (It doesn't always rain means that it eventually
stops raining.) However, most computer scientists have a different
way of thinking. For them, something is not always true if and only
if it might possibly become false. (The program doesn't always
produce the right answer means that it might produce the wrong
answer.)
Acta
Informatica 14, 1 (1980), 21-37.
No electronic version available.
As explained in the discussion of [22], the Owicki-Gries
method and its variants are generalizations of Floyd's method for
reasoning about sequential programs. Hoare's method formalizes
Floyd's with a set of axioms for deriving triples of the form
{P}S{Q}, which means that if statement
S is executed from a state in which P is true and
terminates, then Q will be true. This paper introduces the
generalization of Hoare's method to concurrent programs, replacing
Hoare triples with assertions of the form {I}S,
which means that the individual actions of statement S leave
the predicate I invariant.
Journal of the
Association for Computing Machinery 27, 2 (April 1980).
PDF
Copyright © 1980 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
Before this paper, it was generally assumed that a three-processor
system could tolerate one faulty processor. This paper shows that
"Byzantine" faults, in which a faulty processor sends inconsistent
information to the other processors, can defeat any traditional
three-processor algorithm. (The term Byzantine didn't appear
until [45].) In general, 3
Transactions of the American Nuclear
Society 35 (November 1980), 252-253.
No electronic version available.
In 1980, J Moore and I were both at SRI and had been
involved in the verification of SIFT (see [29]). The nuclear
power industry was, for obvious reasons, interested in the correctness
of computer systems. I forget how it came to pass, but Moore and I
were invited to present a paper on verification at some meeting of
power industry engineers. This was the result.
Communications of the ACM 24, 11 (November 1981), 770-772.
PDF
Copyright © 1981 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
Despite a casual interest in civilian cryptography going back to its
origins (see the discussion of [35]), this is my only
publication in the field. It presents a cute hack for using a single
password to login to a system multiple times without allowing an
adversary to gain access to the system by eavesdropping. This hack is
the basis of Bellcore's S/KEY system and of the PayWord system of
Rivest and Shamir.
Logics of Programs, Dexter Kozen
editor, Springer-Verlag Lecture Notes in Computer Science
Volume 131 (1982), 177-196.
No electronic version available.
Pnueli's introduction of temporal logic in 1977 led to an explosion of
attempts to find new logics for specifying and reasoning about
concurrent systems. Everyone was looking for the silver-bullet logic
that would solve the field's problems. This paper is proof that I was
not immune to this fever. For reasons explained in the discussion of
[49], it is best forgotten. Some people may find this of
historical interest because it is an early example of an interval
logic.
Unpublished (April 1982).
Postscript -
Compressed Postscript -
PDF
I visited Michael Fischer at Yale in the spring of 1982. It was known
that solutions to the Byzantine generals problem that can handle
n Byzantine failures require n+1 rounds of
communication. While I was at Yale, Fischer and I proved that this
number of rounds were needed even to handle more benign failures.
ACM Transactions on Programming Languages and Systems 4, 3 (July 1982), 382-401.
PDF
Copyright © 1982 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
I have long felt that, because it was posed as a cute problem about
philosophers seated around a table, Dijkstra's dining philosopher's
problem received much more attention than it deserves. (For example,
it has probably received more attention in the theory community than
the readers/writers problem, which illustrates the same principles and
has much more practical importance.) I believed that the problem
introduced in [40] was very important and deserved the
attention of computer scientists. The popularity of the dining
philosophers problem taught me that the best way to attract attention
to a problem is to present it in terms of a story.
ACM Transactions on Programming Languages and Systems 4, 3 (July 1982), 455-495.
PDF
Copyright © 1982 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
During the late 70s and early 80s, Susan Owicki and I worked together
quite a bit. We were even planning to write a book on concurrent
program verification. But this paper is the only thing we ever wrote
together.
Science of Computer Programming 2, 3 (December
1982), 175-206.
No electronic version available.
I showed in [26] that there is no invariant way of
defining the global state of a distributed system. Assertional
methods, such as [22], reason about the global state. So,
I concluded that these methods were not appropriate for reasoning
about distributed systems. When I wrote this paper, I was at SRI and
partly funded by a government contract for which we had promised to
write a correctness proof of a distributed algorithm. I tried to
figure out how to write a formal proof without reasoning about the
global state, but I couldn't. The final report was due, so I decided
that there was no alternative to writing an assertional proof. I knew
there would be no problem writing such a proof, but I expected that,
with its reliance on an arbitrary global state, the proof would be
ugly. To my surprise, I discovered that the proof was quite elegant.
Philosophical considerations told me that I shouldn't reason about
global states, but this experience indicated that such reasoning
worked fine. I have always placed more reliance on experience than
philosophy, so I have written assertional proofs of distributed
systems ever since. (Others, more inclined to philosophy, have spent
decades looking for special ways to reason about distributed systems.)
Proceedings of the Tenth ACM Symposium on Principles
of Programming Languages, ACM SIGACT-SIGPLAN (January 1983), 28-37.
PDF
Copyright © 1983 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
From the time I discovered the bakery algorithm (see [11]),
I was fascinated by the problem of reasoning about a concurrent
program without having to break it into indivisible atomic actions.
In [32], I described how to do this for behavioral
reasoning. But I realized that assertional reasoning, as described in
[22], was the only proof method that could scale to more
complex problems. This paper was my first attempt at assertional
reasoning about nonatomic operations. It introduces the win
(weakest invariant) operator that later appeared in
[85], but using the notation of Pratt's dynamic logic
rather than Dijkstra's predicate transformers.
ACM Transactions on Programming Languages and Systems 5, 2
(April 1983), 190-222.
PDF
Copyright © 1983 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
The early methods for reasoning about concurrent programs dealt with
proving that a program satisfied certain properties--usually
invariance properties. But, proving particular properties showed only
that the program satisfied those properties. There remained the
possibility that the program was incorrect because it failed to
satisfy some other properties. Around the early 80s, people working
on assertional verification began looking for ways to write a complete
specification of a system. A specification should say precisely what
it means for the system to be correct, so that if we prove that the
system meets its specification, then we can say that the system
really is correct. (Process algebraists had already been working on
that problem since the mid-70s, but there was--and I think still
is--little communication between them and the assertional
verification community.)
In Highly Dependable
Distributed Systems, final report for SRI Project 4180 (Contract
Number DAEA18-81-G-0062) (June 1983).
No electronic version available.
In the spring of 1983, I was called upon to contribute a chapter for
the final report on a project at SRI. I chose to write a
specification and correctness proof of a Byzantine general's
algorithm--a distributed, real-time algorithm. (Nonfaulty components
must satisfy real-time constraints, and the correctness of the
algorithm depends on these constraints.) I began the exercise on a
Wednesday morning. By noon that Friday, I had the final typeset
output. I presume there are lots of errors; after finishing it, I
never reread it carefully and I have no indication that anyone else
did either. But, I have no reason to doubt the basic correctness of
the proof. I never published this paper because it didn't seem worth
publishing. The only thing I find remarkable about it is that so many
computer scientists are unaware that, even in 1983, writing a formal
correctness proof of a distributed real-time algorithm was an
unremarkable feat.
Journal of
the Association for Computing Machinery 30, 3 (July 1983), 668-676.
PDF
Copyright © 1983 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
This paper introduces a weaker version of the Byzantine generals
problem described in [40]. The problem is "easier"
because there exist approximate solutions with fewer than
3n processes that can tolerate n faults,
something shown in [40] to be impossible for the original
Byzantine generals problem. I don't remember how I came to consider
this problem.
SRI International Technical Report (August 1983).
No electronic version available.
SRI had a contract with Philips to design a graphical editor for
structured documents (such as programs). Goguen and I were the prime
instigators and principle investigators of the project. This is the
project's final report. Rindert Schutten of Philips visited SRI and
implemented a very preliminary version. I felt that our design was
neither novel enough to constitute a major contribution nor modest
enough to be the basis for a practical system at that time, and I
thought the project had been dropped. However, Goguen informed me
much later that some version of the system was still being used in the
early 90s, and that it had evolved into a tool for VLSI layout,
apparently called MetaView.
Information
Processing 83, R. E. A. Mason, ed., Elsevier Publishers (1983),
657-668.
PDF
This was an invited paper. It describes the state of my views on
specification and verification at the time. It is notable for
introducing the idea of invariance under stuttering and explaining why
it's a vital attribute of a specification logic. It is also one of my
better-written papers.
ACM Transactions on Programming Languages and Systems 6, 2 (April 1984),
254-280.
PDF
Copyright © 1984 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
The genesis of this paper was my realization that, in a multiprocess
system with synchronized clocks, the absence of a message can carry
information. I was fascinated by the idea that a process could
communicating zillions of bits of information by not sending
messages. The practical implementation of Byzantine generals
algorithms described in [45] could be viewed as an application
of this idea. I used the idea as something of a gimmick to
justify the paper. The basic message of this paper should have been
pretty obvious: the state machine approach, introduced in
[26], allows us to turn any consensus algorithm into
a general method for implementing distributed systems; the Byzantine
generals algorithms of [45] were fault-tolerant consensus
algorithms; hence, we had fault-tolerant implementations of
arbitrary distributed systems. I published the paper because I had
found few computer scientists who understood this.
ACM Transactions on Programming Languages and Systems 6, 2 (April 1984), 281-296.
PDF
Copyright © 1984 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
I felt that in [39], I had presented the right way to do
assertional (also known as Owicki-Gries style) reasoning about
concurrent programs. However, many people were (and perhaps still
are) hung up on the individual details of different programming
languages and are unable to understand that the same general
principles apply to all of them. In particular, people felt that
"distributed" languages based on rendezvous or message passing were
fundamentally different from the shared-variable language that was
considered in [39]. For example, some people made the silly
claim that the absence of shared variables made it easier to write
concurrent programs in CSP than in more conventional languages. (My
response is the equally silly assertion that it's harder to write
concurrent programs in CSP because the control state is shared between
processors.)
Proceedings of the Third Annual ACM Symposium on
Principles of Distributed Computing (August, 1984), 68-74.
PDF
Copyright © 1984 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
This is the preliminary conference version of [61].
Proceedings of the Third Annual ACM
Symposium on Principles of Distributed Computing (August, 1984)
1-11.
PDF
Copyright © 1984 by the Association for Computing Machinery, Inc.Permission to make digital or hard copies of part
or all of this work for personal or classroom use
is granted without fee provided that copies are
not made or distributed for profit or commercial
advantage and that copies bear this notice and
the full citation on the first page. Copyrights
for components of this work owned by others than
ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to
post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
Request permissions from Publications Dept, ACM
Inc., fax +1 (212) 869-0481, or
permissions@acm.org. The definitive version of
this paper can be found at ACM's Digital Library
--http://www.acm.org/dl/.
This is the invited address I gave at the 1983 PODC conference, which
I transcribed from a tape recording of my presentation. The first few
minutes of the talk were not taped, so I had to reinvent the
beginning. This talk is notable because it marked the rediscovery by
the computer science community of Dijkstra's 1974 CACM paper that
introduced the concept of self-stabilization. A self-stabilizing
system is one that, when started in any state, eventually "rights
itself" and operates correctly. The importance of self-stabilization
to fault tolerance was obvious to me and a handful of people, but went
completely over the head of most readers. Dijkstra's paper gave
little indication of the practical significance of the problem, and
few people understood its importance. So, this gem of a paper had
disappeared without a trace by 1983. My talk brought Dijkstra's paper
to the attention of the PODC community, and now self-stabilization is
a regular subfield of distributed computing. I regard the
resurrection of Dijkstra's brilliant work on self-stabilization to be
one of my greatest contributions to computer science.
Unpublished
(October, 1984).
Postscript -
Compressed Postscript -
PDF
This three-page note gives an example that appears to contradict a
theorem in a TOPLAS article by Gary Peterson. Whether or not
it does depends on the interpretation of the statement of the theorem,
which is given only informally in English. I draw the moral that
greater rigor is needed. When I sent this