This document was made by OCR from a scan of the technical report. It has not been edited or proofread and is not meant for human consumption, but only for search engines. To see the scanned original, replace OCR.htm with Abstract.htm or Abstract.html in the URL that got you here.


Acta Informatica 10, 1— 26 (1978)

 

 

Proof Rules for the Programming Language Euclid

R.L. Londonl*, J.V. Guttag *lc, J.J. Horning 3 ***, B.W. Lampson J.G. Mitchell', and G.J. Popek5****

USC Information Sciences Institute. 4676 Admiralty Way, Marina del Rey, CA 90291, USA = Computer Science Dept., University of Southern California, Los Angeles, CA 90007, USA Computer Systems Research Group, University of Toronto, Toronto, M5S 1A4, Canada Xerox Research Center, 3333 Coyote Hill Road, Palo Alto, CA 94304, USA

5 3532 Boelter Hall, Computer Science Dept., University of California, Los Angeles, CA 90024, USA

Summary. In the spirit of the previous axiomatization of the programming language Pascal, this paper describes Hoare-style proof rules for Euclid, a programming language intended for the expression of system programs which are to be verified. All constructs of Euclid are covered except for storage allocation and machine dependencies.

-The symbolic form of the work has been forced upon us by necessity: without its help we should hare been unable to perform the requisite reasoning.-

A.N. Whitehead and B. Russell. Pri ne ipi a 1hithematica.

p. vii

Rules are rules."      Anonymous

Introduction

The programming language Euclid has been designed to facilitate the construction of verifiable system programs. Its defining report [11] closely follows the defining report [15] of the Pascal language (see also [10]). The present document, giving Hoare-style proof rules applicable only to legal Euclid programs, owes a great deal to (and is in part identical to) the axiomatic definition of Pascal [9]. Major differences from [9] include the treatment of procedures and functions, declara­tions. modules, collections. escape statements, binding, parameterized types, and the examples and detailed explanations in Appendices 1-3. Other semantic defini‑

* To whom offprint requests should be sent. Supported by the Defense Advanced Research Projects Agency under contract DAHC-15-72-C-0308

** Supported in part by the National Science Foundation under grant MCS-76-86089 and the Joint Services Electronics Program monitored by the Air Force Office of Scientific Research under contract F 44620-76-C-0061

*** Supported in part by a Research Leave Grant from the University of Toronto and a grant from the National Research Council of Canada. Current address: Xerox Research Center **** Supported in part by the Defense Advanced Research Projects Agency under contract DAHC­73-C-0368. The views expressed are those of the authors


don methods are certainly applicable to Euclid. We have used proof rules for two reasons: familiarity and the existence of the Pascal definition. Readers unfamiliar with proof rules may to read [6, 7].

One may regard the proof rules as a definition of Euclid in the same sense as the Pascal axiomatization defines Pascal. By stating what can be proved about Euclid programs, the rules define the meaning of most syntactically and seman­tically legal Euclid programs, but they do not give the information required to determine whether or not a program is legal. This information may be found in the language report. Neither do the proof rules define the meaning of illegal Euclid programs containing, for example, division by zero or an invalid array index. Finally, explicit proof rules arc not provided for those portions of Euclid defined in the report by translation into other legal Euclid constructs. This includes pervasive, implicit imports through thus, and some uses of return and exit. All such transformations must be applied before the proof rules are

applicable.

As is the case with Pascal, the Euclid axiomatization should be read in conjunction with the language report, and is an almost total axiomatization of a realistic and useful system programming language. While the primary goal of the Euclid effort was to design a practical programming language (not to provide a vehicle for demonstrating proof rules), proof rule considerations did have significant influence on Euclid [13]. All constructs of the language are covered except for storage allocation (zones and collections that are not referenced-counted) and machine dependencies. In a few instances rules arc applicable only to a subset of Euclid; the restrictions are noted with those rules.

Conventions         Notation

In describing these Euclid proof rules we have used as much as possible of the Pascal axiomatization. We have also deliberately followed the same order and style of presentation and tried to use the same terminology.

The notation P(y/x) to denote substitution is used for the formula which is obtained by systematically substituting v for all free occurrences of x in the predicate P. If this introduces conflict between free variables of y and bound variables of P, the conflict is resolved by systematic change of the latter variables. The notation P(31v) may he read "P with y for .x."

P(.1.1x1,                4'J•„)

denotes simultaneous substitution for all occurrences of any x; by the correspond­ing v1. Thus occurrences of x-; within any yi are not replaced. The expressions

x,,     x„ must be distinct; otherwise the simultaneous substitution is not defined.
The meaning of substitution for subscripted and qualified expressions is defined

in Sections 4 and 5.

Euclid expressions in assertions must be legal Euclid expressions. Assertions may contain, outside of expressions, non-Euclid notations such as quantifiers or

In Ihe proof rules, S and Si denote a sequence of zero or more statements. As is done in the Pascal axiomatization, we refer to values of a type as elements of that type. Also, the rule of consequence

PQ,O{S}LI,UR P{S} R

may be used in proofs.

Data Types

The axioms presented in this and the following sections display the relationship between a type declaration and the axioms which specify the properties of values of the type and operations defined over them. The treatment is not wholly formal, and the reader must be aware that:

1.   Free variables in axioms are assumed to he universally quantified over the appropriate type.

2.   The expression of the "induction" axiom is always left informal.

3.   The types of variables used in the axioms have to be deduced either from the section heading or from the more immediate context.

4.  The name of a type is used as a transfer function constructing a value of the type (such a use of the type identifier is not available in Euclid).

5.  The verifier can assume that everything will be type-checked by the compiler or that the compiler will generate the necessary legality assertions.

6.   In defining Euclid's types we will not be presenting proof rules that may he directly applied to Euclid programs. Rather, we will provide a set of assertions, denoted by about the values of the type being defined. These assertions are incorporated into proof rules in Section 9 on constant, variable, and type declara­tions. Rule 9.1, for example, tells us that in the case of an identifier declaration,

we may use the fact that )(ET, i.e., x has the attributes associated with values of type T

Parameterized types arc covered in Section 9.5. The rules for type compatibility in Euclid are defined in the language report.

Scalar Types

type T=(c,,c2,...,c,,)

I.I. c,, c2,                    c„ are distinct elements of T.

.2.     These are the only elements of T.

.3.         = T. slice (ed for i= I,   n

.4.   c1= T. pred(ci .0) for 1=         n — 1

.5.                   < 

.6.     < A (37 < 1--) <

.7.     (x   e„) (x < T. sitee(x))
.8.


I.).        A lc v •=7.. --i(As>s)

I.10,       ,t.    —I       <

1.11. X   Y           ty

1.12. T . first  e,

1.13. 7'. last =e„

1.14. T. ord(cd=i — 1 for i=1,      11.

The standard scalar type Boolean is defined as

type Boo/ean = (False, True).

The Boolean operators not, and, or, and (implication) are those of the con­ventional logical calculus except that some operands may possibly not be evaluated. Specifically, in terms of conditional expressions,

x and v means if x then v else false, i.e., x cand y, x or i' means if x then true else y, i.e., x cor y, X r means if x then v else true.

(Note, however, that conditional expressions are not included in Euclid.)

The standard type integer stands for the set of the whole numbers. The arithmetic operators +. *, and div are those of whole number arithmetic. The modulus operator mod is defined by the equation2.4. u < r char . ord (u)< char . ard(c).

These axioms have been designed to make possible interchange of programs between implementations using different character sets. It should be noted that the function char. ord does not necessarily map the characters onto consecutive

integers.

Subrange Types

type T=nt..n

Let m, n be elements of scalar type T0.

3.1. T. first =in

3.2. 7'. last =

3.3. For all i in T0 such that m        a, i is an element of T.
3.4. These are the only elements of 7:

3.5. T.. first i < T. last P T. succ(i)=       succ(i)

3.6. 7'. first <i T. last P7'. pred (0-- T0. prcd(i).

Note that since all elements of T are elements of       all operators of To apply
to them.


Text Box: Array Types
type T=array I of "T„
Ilcre 1 is a scalar or subrange type. Let in= I . first and a= I . last.
4.1. If	is an element of 7 for all i such that m	Sit, then T(x,„,	x„) is an
clement of T
4.2. These are the only elements of T 4.3. T(x,„,	.v,,)(i)= x; for in	Pl.
Letting x stand for T(x„„	x„) we introduce the following abbreviation:
m mod — m —(m div n)*n

whereas div denotes division with truncated fraction, i.e., move toward zero. Implementations are permitted to refuse the execution of programs which refer to integers outside the appropriate range, signedlat or unsignedlat.

Type Char

2.1. The elements of type char are the 26 (capital) letters, the 10 (decimal) digits, and possibly other characters defined by particular implementations. In programs, a constant of type Char is denoted by preceding the character

with a S.

The sets of letters and digits are ordered, and the digits arc coherent, i.e.,             (x,           stands for T(x,„, ...,Xf_ t, y, xj±i,


2.2. A <S            $0.<5b             $ 1= char . succ ($ 0)

B <S C       $b<Sc         $ 2= char . succ (S 1)

•••

)'<5 Z    $ y <5 2.        5 9= char . succ($ 8).

Axioms 1.5-1.13 apply to the char type. The functions char ore! and Chr arc defined by the following additional axioms:

13. If u is an clement of char, then char. ord(u) is a non-negative integer (called the ordinal number of u), and

Chr(char. and (u))= u


The formula 1'(y/v(i)),

denoting a substitution for an array reference, is then defined to be

P((x, I: 17)/x).

To extend this definition to substitution for a multiply-subscripted array reference, we define

(x,                ...,1„): y), where n 2


12.I            et            ('roof Rules l'or Inc Programming Language Euclid  7


o he equal to

in: y))

and the formula

My/x(i,)            (i„)), where n 2

is equivalent to

P((x, <i,,                  y)/x).

4.4. x. I ndexType = x • Component Type = To.

Record Types

type T = record,

S. ; 71, • •• • Sru:      end T

where each s, must be preceded by const or var.
Let x, be an element of Ti for          = 1, ..., m.

5.1. 7.(x,, x2, ..., x,„) is an element of T 5.2. These are the only elements of T. 5.3. T(x,,             x„,) si= xi   for   i= 1,      , tn.

(x,     r) stands for T(x,, ...,              y, xi+                       x„).

The formula P(y/x s1) is defined analogously to arrays (sec definitions fol­lowing 4.3).

1,itriant Records

type 7;,(c).--- record          , s„,: TM;

case c of

k,         s',: T,' end k,
k2=--> 52' :
7; end k2

k„         s„: T„' end k„

end case

end T

where the type of c is a scalar or subrange type with elements k                    k„. Note

that I. k,„                S stands for k„    S;        S; ...;             S.

Consider type T =To(kj), and let x'j be an element of 77 for j= I, --11.

5.1 a. T(x,,      x„„ kj, .9 is an element of T.
5.2a. These are the only elements of T.

5.3 a. Ttv,       „„ k      . sir= x , for i = 1, ..., m5.4a. 7'(x1,          x,,                si=x; for = 1,        n

5.5 a. Letting z stand for T(x, ,                  x„„               x'j), the standard Euclid component
itsTag is defined by z. itsThg= kj.

A variant record may also contain an otherwise clause otherwise           7;;+,

in which case the type of c may contain elements in addition to k1, k„. If there is such a clause, we have the following additional axioms for the otherwise (n + 1) situation:

Let kJ be an element of the type of c such that kj+ki for I . in, and let x;,,, be an element of

5.1 b. T(x1,            x„„ kj, x„   ,) is also an element of T.

5.4 b. T(x, ,                     kj,                                     x„,1

5.511 Identical to 5.5a.

The case with a field list containing several fields

sj, :                             end kJ

is to be interpreted as

end ki

where      is a fresh identifier, and 77 is a type defined as

type 77 = record      To,............. Th end record.

In this case x. sj, is interpreted as x.                sj,.

Now consider type T = Tdany).

5.1 c. T(x,,                   y, x') is an element of T for (y, x') = (1(1, xi),      ,(k„, x„), (k,,,

                  where k, ki for   = 1,          It.

5.2c. These are the only elements of T.

5.3c. T(x, ,         x,„, y, x')   si= x, for      i= 1, ...,

There is no 5.4c since access to the variant part of the record may only be done using the discriminating case statement (see Section 12.5).

Machine-dependent records do not change verification properties. The ad‑

ditional information supplied can be ignored (unless the machine interprets certain locations specially).

Set Types

type T.—set of To

Let x, be an element of To.

6.1. T( ) is an element of T.

6.2. If x is an element of T, then x+ T(xo) is an element of T (the operator + is defined below).


6.3. These arc the only elements of T

6.4. TCV/„X2....,.x„) means T([[T( )+ T(x1)] + M2)] + • + 71xJ1- 6.5, T. Base Type To.

It I denotes the empty set, and T(xo) denotes the singleton set containing .x.„. The operators +, *, xor, and in, applied to elements whose type is set, denote the conventional operations of set union, intersection, difference, symmetric dif­ference, and membership. The operators < = and > = denote set inclusion. The specific axioms defining these operators are omitted.

Note that Euclid allows implementations to restrict set types to he built only on base types              with a specified maximum number of elements.

Alodule Types

Because modules are by far the most complex part of Euclid, this section contains a large amount of explanation as well as the technical details of the module rule. The explanatory material can be read now. However, we recommend that Sec­tion 10 on procedures and functions be read before reading the rule and the technical details. An example of the use of the module rule appears as Appendix 3. Additional comments on the module construct appear in the Epilogue. The material on modules appears here because modules, like subrange types and array types, are type constructors.

type T(const C)= pre P1; post QI; module t invariant Q0; imports (var Y const D readonly