Cracking
the
Vault
an introduction to the Vault Programming Language
We've designed this tutorial to be read by a wide audience, from working programmers to programming language experts. To accommodate this diverse readership, each topics starts with a short technical description with some examples, plus links that allow you to "drill down" for more details if the topic is unfamiliar or especially interesting to you. As a starting point, we assume that you are an experienced programmer with working knowledge of the C programming language, which we often use as a point of reference. Although this document covers all of Vault, the emphasis is on letting you start writing useful programs right away.
Vault is a safe version of the C programming language, with rich types (including closures and algebraic data types), a module system, and a new feature for specifying and checking program resource use (keys). By programming in Vault, a developer can discover at compile time errors that C programmers typically find only through testing (if at all): NULL pointer dereferences; dangling pointers; memory leaks; missing lock protection and lock ordering bugs; and interface misuse, like applying operations in the wrong order or forgetting to call finalizing operations.
Vault and C share a lot in common: they share many basic types (char,
int, short, long,
float, double); they share the same expressions with
the same associativity and precedence; and they share the same statements,
except goto. There are also several minor differences between them,
which fall into the following categories:
Vault provides type bool with values true
and false. Both conditional statements (if, while,
for and do) and logical expressions (!e,
e&&e, and e||e) accept only boolean
expressions.
Vault provides a type string
representing zero-terminated sequences of characters. Literals written in double quotes
("like this") are of type string
(not type char* as in C). The module Buffer
provides the basic functions for manipulating strings.
In a C block, all declarations must appear before the first statement; whereas, in Vault, declarations and statements can be intermixed. The scope of a declared name is from the declaration to the end of the enclosing block. The same name cannot be declared more than once in the same scope.
In Vault, all variables must be initialized. For convenience, many of the basic types provide a default value (false for booleans, zero for arithmetic and floating-point types, the empty string for strings). If a variable is declared to have a type with a default value, the declaration can omit the initializer; the variable is initialized to the type's default value. If the type does not have a default value, the declaration must have an explicit initializer.
As a safe language, Vault does not allow an expression to be typecast
to an arbitrary type. Vault does support certain built-in type
conversions, which have a typecast syntax similar to C++. A developer can use
typecast syntax to convert back and forth
between scalar values of various bit widths: byte;
char; short; int;
long; string.
Like C, Vault provides structures to describe aggregate data; both
the structure type itself and the structure's fields must all be named. As a
light-weight alternative, Vault also provides tuples. Tuples are not
named and their fields are numbered starting from zero. Tuple types and
values are both written in parentheses with comma-separated fields, e.g. the
value (false,3,"hello") has the type (bool,int,string).
Two tuple types are equivalent if they have the same structure.
Vault's structure declarations have a different meaning than those in
C. In C, a struct declaration describes a memory block that
holds the structure's fields. To declare a pointer to this memory block,
a C programmer uses the pointer type
modifier (*). Instead, in
Vault, a struct declaration itself describes a pointer to
the memory block that holds the structure's fields. To refer to the
memory block itself, a Vault programmer uses the flat type
modifier. [Example] (Hence, C's pointer modifier (*) and Vault's flat
modifier are duals.)
This design makes Vault's structures analogous to
arrays in C and Vault.
For instance, with the C declaration int x[3], the name x
refers to a pointer to the memory block that holds the array elements,
not to the memory block itself. Similarly, given a structure
declaration struct s and the variable declaration s x,
the name x refers to a pointer to a memory block holding
the structure's fields, not to the memory block itself.
Unlike C pointers, the pointer inherent in a struct
declaration cannot be NULL. To allow a pointer to be NULL, a Vault programmer uses
the ? type modifier. Given the variable declaration s?
x, the name x is a possibly-NULL pointer to a
structure s. A switch statement can be used to tell whether
such a pointer is NULL. [Example]
Finally, given any type T, the parameterized type ref<T>
is like the C pointer declaration T*, except that the
pointer cannot be NULL. The type ref is handy for passing
function parameters by reference.
[More detail]
The C type modifier const can be used to make both
variables and the targets of pointers immutable. For example, given the
declaration const char * const str, neither the variable str
nor the contents of the string to which str refers may be
updated. Vault provides a more limited modifier const,
which only allows variables to be declared immutable. The const
keyword must directly precede the name being declared immutable, e.g. float
const pi = 3.14159.
[More detail]
Vault functions are first-class: they can be created at run time, stored in
data structures, passed as function parameters, and returned as function
results. A Vault function is a closure that captures its environment. The value
of a captured variable is its value at the point at which the
function is defined. [Example] A captured
variable is implicitly declared const and cannot be updated in the
function that captures it. [Example] To
capture a variable that can be updated, the variable can be declared as a
mutable reference, that is, a variable of type ref<T> for
some type T. [Example] Vault provides syntactic sugar for functions that return others
functions; this syntactic sugar is called curried function syntax. A
function written in curried function syntax can be partially called, just like
the equivalent function in standard syntax. [Example]
Vault supports algebraic data types, called variants. A variant is like a C union, plus a tag whose value is guaranteed to reflect which piece of data is stored in the union at a given program point. A variant type consists of a set of constructors, each of which is a literal name (an identifier preceded with a back quote) and an optional list of data associated with that constructor. [Example] Because constructor names are literals, the same constructor name can appear in any number of type definitions. To test what kind of value a variant currently holds, a programmer uses a switch statement. [Example] Vault's switch statement supports pattern matching, which includes the following patterns: the "don't care" pattern; literal patterns (for arithmetic types, floating-point types, and strings); name patterns; variant patterns; and tuple patterns [Example]. These atomic patterns can be nested to form more complex patterns. [Example]
To support libraries of reusable algorithms and data structures, Vault allows
functions and type definitions to be parameterized by types. Such functions and
type definitions are called generic. Type parameters for functions are
written in angle brackets after the function's name. [Example]
Similarly, type parameters for type definitions are written in angle brackets
after the type's name. [Example]
A function or type definition may have any number of type parameters. Because
generic functions are compiled once (and not once per instantiation, as with
C++), a type parameter must always be instantiated on a type of a fixed size,
whose default is 32 bits. Since a type parameter can be instantiated on any
type, the only operations available on the type parameter are assignment (=),
equality (==), and inequality (!=), where the latter
two are bitwise comparisons. [Example]
Vault provides a module system to support decomposing large programs into
smaller pieces. A module definition has its own name space. A module M's
definition of x is referenced as M.x outside of module
M. All module-level definitions are mutually recursive; all
module-level names are simultaneously in scope. [Example]
Module definitions can be nested arbitrarily.
An interface declaration describes the contents of a module. Ascribing an interface to a module has two effects: first, the module must contain a definition that satisfies each specification in the interface; second, the module's publicly exported definitions are restricted to those mentioned in the interface. [Example] (If a module is not ascribed any interface, all of its module-level definitions are exported.) When a module defines a type and has an interface specifying that the type is opaque, then the type's original definition is forgotten and it is considered distinct from all other types. [Example] Opaque types support the principle of information hiding.
A module defined in one source file refers to an interface or module defined in a different source file with an external interface or module declaration. [Example] An interface can be ascribed to an external module, just like any other module.
A type system in a programming language performs several duties. Types control what the allowed operations on a particular data item are. For example, given a value x of struct type with a field f, it is valid to form the index x.f. Thus at the language level, types control what operations are applicable to which classes of values. Second, types help the compiler generate appropriate code for manipulating data values.
In Vault, we add an important new dimension to the type system: resource management and access control. These features allow the programmer to express behavioral aspects of their code and have the compiler check for conformance automatically at compile time.