Publications

  1. Manuel Fähndrich, Songtao Xia. Establishing Object Invariants with Delayed Types October 2007 Proceedings of the 22nd ACM Conference on Object-Oriented Programming, Systems, and Applications
    Mainstream object-oriented languages such as C# and Java provide an initializationmodel for objects that does not guarantee programmer controlled initialization of fields. Instead, all fields are initialized to default values (0 for scalars and null for non-scalars) on allocation. This is in stark contrast to functional languages, where all parts of an allocation are initialized to programmer-provided values. These choices have a direct impact on two main issues: 1) the prevalence of null in object oriented languages (and its general absence in functional languages), and 2) the ability to initialize circular data structures. This paper explores connections between these differing approaches and proposes a fresh look at initialization. Delayed types are introduced to express and formalize prevalent initialization patterns in object-oriented languages.

  2. Manuel Fähndrich, Mark Aiken, Chris Hawblitzel, Orion Hodson, Galen Hunt, James R. Larus, Steven Levi. Language Support for Fast and Reliable Message-based Communication in Singularity OS April 2006 Proceedings of the EuroSys 2006 Conference 177--190
    Message-based communication offers the potential benefits of providing stronger specification and cleaner separation between components. Compared with shared-memory interactions, message passing has the potential disadvantages of more expensive data exchange (no direct sharing) and more complicated programming. In this paper we report on the language, verification, and run-time system features that make messages practical as the sole means of communication between processes in the Singularity operating system. We show that using advanced programming language and verification techniques, it is possible to provide and enforce strong system-wide invariants that enable efficient communication and low-overhead software- based process isolation. Furthermore, specifications on communication channels help in detecting programmer mistakes early---namely at compile-time---thereby reducing the difficulty of the message-based programming model. The paper describes our communication invariants, the language and verification features that support them, as well as implementation details of the infrastructure. A number of benchmarks show the competitiveness of this approach.

  3. Manuel Fähndrich, Michael Carbin, James R. Larus. Reflective Program Generation with Patterns October 2006 Proceedings of the 2006 Conference on Generative Programming and Component Engineering 275--284
    Runtime reflection facilities, as present in Java and .NET, are powerful mechanisms for inspecting existing code and metadata, as well as generating new code and metadata on the fly. Such power does come at a high price though. The runtime reflection support in Java and .NET imposes a cost on all programs, whether they use reflection or not, simply by the necessity of keeping all metadata around and the inability to optimize code because of future possible code changes. A second---often overlooked---cost is the difficulty of writing correct reflection code to inspect or emit new metadata and code and the risk that the emitted code is not well-formed. In this paper we examine a subclass of problems that can be addressed using a simpler mechanism than runtime reflection, which we call compile-time reflection.We argue for a high-level construct called a transform that allows programmers to write inspection and generation code in a pattern matching and template style, avoiding at the same time the complexities of reflection APIs and providing the benefits of staged compilation in that the generated code and metadata is known to be well-formed and type safe ahead of time.

  4. Mike Barnett, Robert DeLine, Manuel Fähndrich, K. Rustan M. Leino, Wolfram Schulte. Verification of object-oriented programs with invariants June 2004 Journal of Object Technology 6 27-56 3
    An object invariant defines what it means for an object’s data to be in a consistent state. Object invariants are central to the design and correctness of objectoriented programs. This paper defines a programming methodology for using object invariants. The methodology, which enriches a program’s state space to express when each object invariant holds, deals with owned object components, ownership transfer, and subclassing, and is expressive enough to allow many interesting object-oriented programs to be specified and verified. Lending itself to sound modular verification, the methodology also provides a solution to the problem of determining what state a method is allowed to modify.

  5. Robert DeLine, Manuel Fähndrich. Typestates for objects June 2004 ECOOP 2004 --- Object-Oriented Programming, 18th European Conference Martin Odersky 465-490 Lecture Notes in Computer Science 3086
    Today's mainstream object-oriented compilers and tools do not support declaring and statically checking simple pre- and postconditions on methods and invariants on object representations. The main technical problem preventing static verification is reasoning about the sharing relationships among objects as well as where object invariants should hold. We have developed a programming model of typestates for objects with a sound modular checking algorithm. The programming model handles typical aspects of object-oriented programs such as downcasting, virtual dispatch, direct calls, and subclassing. The model also permits subclasses to extend the interpretation of typestates and to introduce additional typestates. We handle aliasing by adapting our previous work on practical linear types developed in the context of the Vault system. We have implemented these ideas in a tool called Fugue for specifying and checking typestates on Microsoft .NET-based programs.

  6. James R. Larus, Thomas Ball, Manuvir Das, Robert DeLine, Manuel Fähndrich, Jon Pincus, Sriram K. Rajamani, Ramanathan Venkatapathy. Righting Software May 2004 IEEE Software 3 92--100 21
    Correctness tools can improve software development by systematically detecting programming errors. Microsoft Research has developed two generations of these tools that help programmers find and fix errors early in the development process.

  7. Mike Barnett, Robert DeLine, Manuel Fähndrich, K. Rustan M. Leino, Wolfram Schulte. Verification of Object-Oriented Programs with Invariants July 2003 Formal Techniques for Java-like Programs 2003 (Proceedings) Susan Eisenbach and Gary T. Leavens and Peter Müller and Arnd Poetzsch-Heffter and Erik Poll 61--68

  8. Manuel Fähndrich, K. Rustan M. Leino. Declaring and checking non-null types in an object-oriented language November 2003 Proceedings of the 2003 ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2003 302--312
    Distinguishing non-null references from possibly-null references at the type level can detect null-related errors in objectoriented programs at compile-time. This paper gives a proposal for retrofitting a language such as C or Java with non-null types. It addresses the central complications that arise in constructors, where declared non-null fields may not yet have been initialized, but the partially constructed object is already accessible. The paper reports experience with an implementation for annotating and checking null-related properties in C programs.

  9. Manuel Fähndrich, K. Rustan M. Leino. Heap monotonic typestate July 2003 Proceedings of the first International Workshop on Alias Confinement and Ownership (IWACO)
    The paper defines the class of heap monotonic typestates. The monotonicity of such typestates enables sound checking algorithms without the need for non-aliasing regimes of pointers. The basic idea is that data structures evolve over time in a manner that only makes their representation invariants grow stronger, never weaker. This assumption guarantees that existing object references with particular typestates remain valid in all program futures, while still allowing objects to attain new stronger typestates. The system is powerful enough to establish properties of circular data structures.

  10. Manuel Fähndrich, Robert DeLine. Adoption and Focus: Practical Linear Types for Imperative Programming May 2002 Proceedings of the 2002 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI) 13-24
    A type system with linearity is useful for checking software protocols and resource management at compile time. Linearity provides powerful reasoning about state changes, but at the price of restrictions on aliasing. The hard division between linear and nonlinear types forces the programmer to make a trade-off between checking a protocol on an object and aliasing the object. Most onerous is the restriction that any type with a linear component must itself be linear. Because of this, checking a protocol on an object imposes aliasing restrictions on any data structure that directly or indirectly points to the object. We propose a new type system that reduces these restrictions with the adoption and focus constructs. Adoption safely allows a programmer to alias objects on which she is checking protocols, and focus allows the reverse. A programmer can alias data structures that point to linear objects and use focus for safe access to those objects. We discuss how we implemented these ideas in the Vault programming language.

  11. Robert DeLine, Manuel Fähndrich. Enforcing high-level protocols in low-level software May 2001 Proceedings of the 2001 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI) 59-69
    The reliability of infrastructure software, such as operating systems and web servers, is often hampered by the mismanagement of resources, such as memory and network connections. The Vault programming language allows a programmer to describe resource management protocols that the compiler can statically enforce. Such a protocol can specify that operations must be performed in a certain order and that certain operations must be performed before accessing a given data object. Furthermore, Vault enforces statically that resources cannot be leaked. We validate the utility of our approach by enforcing protocols present in the interface between the Windows 2000 kernel and its device drivers.

  12. J. Rehof, M. Fähndrich. Type-Based Flow Analysis: From Polymorphic Subtyping to CFL-Reachability January 2001 Proceedings POPL 2001, 28'th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
    We present a novel approach to scalable implementation of type-based flow analysis with polymorphic subtyping. By giving a new presentation of polymorphic subtyping with instantiation constraints, we are able to apply context-free language (CFL) reachability techniques to type-based flow analysis. We develop a CFL-based algorithm for computing flow information in time $O(n^3)$, where $n$ is the size of the typed program. The algorithm substantially improves upon the best previously known algorithm for this problem with complexity $O(n^8)$. Our technique also yields the first demand-driven algorithm for polymorphic subtype-based flow-computation. It works directly on higher-order programs with structured data, supports context-sensitive, global flow summarization and includes polymorphic recursion.

  13. Fähndrich, M., Rehof, J., Das, M.. Scalable Context-Sensitive Flow Analysis Using Instantiation Constraints June 2000 Proceedings of the 2000 ACM SIGPLAN Conference on Programming Language Design and Implementation
    This paper shows that a type graph (obtained via polymorphic type inference) harbors explicit directional flow paths between functions. These flow paths arise from the instantiations of polymorphic types and correspond to call-return sequences in first-order programs. We show that flow information can be computed efficiently while considering only paths with well matched call-return sequences, even in the higher-order case. Furthermore, we present a practical algorithm for inferring type instantiation graphs and provide empirical evidence to the scalability of the presented techniques by applying them in the context of points-to analysis for C programs.

  14. Jeffrey S. Foster, Manuel Fähndrich, Alexander Aiken. Polymorphic versus Monomorphic Flow-insensitive Points-to Analysis for C June 2000 Proceedings of the 7th International Static Analysis Symposium
    We carry out an experimental analysis for two of the design dimensions of flow-insensitive points-to analysis for C: polymorphic versus monomorphic and equality-based versus inclusion-based. Holding other analysis parameters fixed, we measure the precision of the four design points on a suite of benchmarks of up to 90,000 abstract syntax tree nodes. Our experiments show that the benefit of polymorphism varies significantly with the underlying monomorphic analysis. For our equalitybased analysis, adding polymorphism greatly increases precision, while for our inclusion-based analysis, adding polymorphism hardly makes any difference. We also gain some insight into the nature of polymorphism in points-to analysis of C. In particular, we find considerable polymorphism available in function parameters, but little or no polymorphism in function results, and we show how this observation explains our results.

  15. Zhendong Su, Manuel Fähndrich, Alexander Aiken. Projection Merging: Reducing Redundancies in Inclusion Constraint Graphs January 2000 Proceedings POPL 2000, 27'th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
    Inclusion-based program analyses are implemented by adding new edges to directed graphs. In most analyses, there are many different ways to add a transitive edge between two nodes, namely through each different path connecting the nodes. This path redundancy limits the scalability of these analyses. We present projection merging, a technique to reduce path redundancy. Combined with cycle elimination, projection merging achieves orders of magnitude speedup of analysis time on programs over that of using cycle elimination alone.

  16. Jeffrey S. Foster, Manuel Fähndrich, Alexander Aiken. A Theory of Type Qualifiers May 1999 Proceedings of the 1999 ACM SIGPLAN Conference on Programming Language Design and Implementation 192--203
    We describe a framework for adding type qualifiers to a language. Type qualifiers encode a simple but highly useful form of subtyping. Our framework extends standard type rules to model the flow of qualifiers through a program, where each qualifier or set of qualifiers comes with additional rules that capture its semantics. Our framework allows types to be polymorphic in the type qualifiers. We present a const-inference system for C as an example application of the framework. We show that for a set of real C programs, many more consts can be used than are actually present in the original code.

  17. M. Fähndrich, J. Foster, Z. Su, A. Aiken. Partial Online Cycle Elimination in Inclusion Constraint Graphs June 1998 Proceedings of the 1998 ACM SIGPLAN Conference on Programming Language Design and Implementation 33:5 85--96 SIGPLAN notices
    Many program analyses are naturally formulated and implemented using inclusion constraints. We present newresults on the scalable implementation of such analyses based on two insights: first, that online elimination of cyclic constraints yields orders-of magnitude improvements in analysis time for large problems; second, that the choice of constraint representation affects the quality and efficiency of online cycle elimination. We present an analytical model that explains our design choices and show that the model's predictions match well with results from a substantial experiment.

  18. Alexander Aiken, Manuel Fähndrich, Zhendong Su. Detecting Races in Relay Ladder Logic Programs March 1998 Proceedings of the 1st Conference on Tools and Algorithms for the Analysis and Construction of Systems
    Relay Ladder Logic (RLL) is a programming language widely used for complex embedded control applications such as manu- facturing and amusement park rides. The cost of bugs in RLL programs is extremely high, often measured in millions of dollars (for shutting down a factory) or human safety (for rides). In this paper, we describe our experience in applying constraint-based program analysis techniques to analyze production RLL programs. Our approach is an interesting combination of probabilistic testing and program analysis, and we show that our system is able to detect bugs with high probability, up to the approximations made by the conservative program analysis. We demon- strate that our analysis is useful in detecting some flaws in production RLL programs that are difficult to find by other techniques.

  19. Manuel Fähndrich, John Boyland. Staticallly Checkable Pattern Abstractions June 1997 Proceedings of the International Conference on Functional Programming (ICFP '97) 75--84
    Pattern abstractions increase the expressiveness of pattern matching, enabling the programmer to describe a broader class of regular forests with patterns. Furthermore, pattern abstractions support code reuse and code factoring, features that facilitate maintenance and evolution of code. Past research on pattern abstractions has generally ignored the aspect of compile-time checks for exhaustiveness and redundancy. In this paper we propose a class of expressive patterns that admits these compile-time checks.

  20. Manuel Fähndrich, Alexander Aiken. Refined Type Inference for ML June 1997 Proceedings of the 1st Workshop on Types in Compilation

  21. Manuel Fähndrich, Alexander Aiken. Program Analysis Using Mixed Term and Set Constraints September 1997 Proceedings of the 4th International Static Analysis Symposium 114--126 Lecture Notes in Computer Science 1302

  22. Manuel Fähndrich, Alex Aiken. Making Set-Constraint Based Program Analyses Scale August 1996 Cambridge, MA First Workshop on Set Constraints at CP'96

  23. A. Aiken, M. Fähndrich. Dynamic Typing and Subtype Inference June 1995 Proceedings of the 1995 Conference on Functional Programming Languages and Computer Architecture 182--191
    Dynamic typing is a program analysis targeted at removing runtime tagging and untagging operations from programs written in dynamically typed languages. This paper compares dynamic typing with a subtyping system based on set constraints. The purpose is both to make precise the relationship between two superficially unrelated type systems and to illustrate how the advantages of dynamic typing and subtype inference can be combined. The central result is a theorem showing that a typing discipline at least as powerful as dynamic typing can be expressed using set constraints.

  24. A. Aiken, M. Fähndrich, R. Levien. Better Static Memory Management: Improving region-based analysis of higher-order languages June 1995 Proceedings of the 1995 ACM SIGPLAN Conference on Programming Language Design and Implementation 174--185
    Static memory management replaces runtime garbage collection with compile-time annotations that make all memory allocation and deallocation explicit in a program. We improve upon the Tofte/Talpin region-based scheme for compile-time memory management [TT94]. In the Tofte/Talpin approach, all values, including closures, are stored in regions. Region lifetimes coincide with lexical scope, thus forming a runtime stack of regions and eliminating the need for garbage collection. We relax the requirement that region lifetimes be lexical. Rather, regions are allocated late and deallocated as early as possible by explicit memory operations. The placement of allocation and deallocation annotations is determined by solving a system of constraints that expresses all possible annotations. Experiments show that our approach reduces memory requirements significantly, in some cases asymptotically.

Technical Reports

  1. Deconstructing Process Isolation
    Mark Aiken, Manuel Fähndrich, Chris Hawblitzel, Galen Hunt, James Larus

    Most operating systems enforce process isolation through hardware protection mechanisms such as memory segmentation, page mapping, and differentiated user and kernel instructions. Singularity is a new operating system that uses software mechanisms to enforce process isolation. A software isolated process (SIP) is a process whose boundaries are established by language safety rules and enforced by static type checking. With proper system support, SIPs can provide a low cost isolation mechanism that provides failure isolation and fast inter-process communication. To compare the performance of Singularity’s approach against more conventional systems, we implemented an optional hardware isolation mechanism. Protection domains are hardware-enforced address spaces, which can contain one or more SIPs. Domains can either run at the kernel’s privilege levels and share an exchange heap or be fully isolated from the kernel and run at the normal application privilege level. These domains can construct Singularity configurations that are similar to micro-kernel and monolithic kernel systems. Hardware-based isolation incurs non-trivial performance costs (up to 25-33%) and complicates system implementation. Software isolation has less than 5% overhead on these benchmarks. The lower cost of SIPs permits them to provide protection and failure isolation at a finer granularity than conventional processes. Singularity’s ability to employ hardware isolation selectively, rather than at every process boundary, supports the construction of more resilient system architectures.

  2. Sealing OS Processes to Improve Dependability and Security
    Galen C. Hunt, Mark Aiken, Paul Barham, Manuel Fähndrich, Chris Hawblitzel, Orion Hodson, James R. Larus, Steven Levi, Nick Murphy, Bjarne Steensgaard, David Tarditi, Ted Wobber, Brian D. Zill

    On most modern operating systems, a process is a hardware-protected abstraction for executing potentially mutable code and data. Common features of processes include: dynamic code loading, dynamic code generation, access to cross-process shared memory, and a universal API. This paper argues that many of the dependability and security weaknesses of modern systems are exacerbated by this open process architecture. Moreover, this architecture impairs the ability of tools to analyze code statically, to improve its performance or dependability. By contrast, a sealed process architecture prohibits dynamic code loading, prohibits self-modifying code, prohibits shared memory, and replaces a universal API with a process-limited API. This paper describes an implementation of a sealed process architecture in the Singularity operating system, discusses its merits, and evaluates its impact. Among the benefits are: improved static program analysis, strong security guarantees, elimination of OS redundancies found in language runtimes such as the JVM and CLR, and better software engineering.

  3. An Overview of the Singularity Project
    Galen Hunt, James Larus, Mart\'in Abadi, Mark Aiken, Paul Barham, Manuel Fähndrich, Chris Hawblitzel, Orion Hodson, Steven Levi, Nick Murphy, Bjarne Steensgaard, David Tarditi, Ted Wobber, Brian Zill

    Singularity is a research project in Microsoft Research that started with the question: what would a software platform look like if it was designed from scratch with the primary goal of dependability? Singularity is working to answer this question by building on advances in programming languages and tools to develop a new system architecture and operating system (named Singularity), with the aim of producing a more robust and dependable software platform. Singularity demonstrates the practicality of new technologies and architectural decisions, which should lead to the construction of more robust and dependable systems.

  4. The Fugue Protocol Checker: Is Your Software Baroque?
    Robert DeLine, Manuel Fähndrich

    Even in a safe programming language, such as C or Java, disobeying the rules for using an interface can cause exceptions at run time. Such rules govern how system resources are managed, the order of method calls, and the formatting of string parameters, such as SQL queries. This paper introduces Fugue, a modular static checker for languages that compile to the Common Language Runtime. Fugue allows the rules for using an interface to be recorded as declarative specifications and provides a range of annotations that allow a developer to specify interface rule with varying precision. At the simplest end of the range, a specifier can mark those methods that allocate and release resources. A specifier can also limit the order in which an object’s methods may be called to the transitions of a finite state machine. At the most complex end of the range, a specifier can give a method a plug-in pre- and postconditon, which is arbitrary code that examines an object’s current state and a static approximation of the method’s actuals, decides whether the call is legal and returns the object’s state after the call. We used these features to specify rules for using ado.net, a library for accessing relational databases, and found several errors in an internal Microsoft Research web site, which extensively uses this library.

  5. Estimating the Impact of Scalable Pointer Analysis on Optimization
    Manuvir Das, Manuel Fähndrich, Ben Liblit, Jakob Rehof

    This paper addresses the following question: Do scalable control-flow-insensitive pointer analyses provide the level of precision required to make them useful in compiler optimizations? We first describe alias frequency, a metric that measures the ability of a pointer analysis to determine that pairs of memory accesses in C programs cannot be aliases. We believe that this kind of information is useful for a variety of optimizations, while remaining independent of a particular optimization. We show that control-flow and context insensitive analyses provide the same answer as the best possible pointer analysis on at least 95% of all statically generated alias queries. In order to understand the potential run-time impact of the remaining 5% queries, we weight the alias queries by dynamic execution counts obtained from profile data. Flow-insensitive pointer analyses are accurate on at least 95% of the weighted alias queries as well. We then examine whether scalable pointer analyses are inaccurate on the remaining 5% alias queries because they are context-insensitive. To this end, we have developed a new context-sensitive pointer analysis that also serves as a general engine for tracing the flow of values in C programs. To our knowledge, it is the first technique for performing context-sensitive analysis with subtyping that scales to millions of lines of code. We find that the new algorithm does not identify fewer aliases than the contextinsensitive analysis.

  6. From Polymorphic Subtyping to CFL Reachability: Context-Sensitive Flow Analysis Using Instantiation Constraints
    Fähndrich, M., Rehof, J., Das, M.

    We present a novel approach to computing context-sensitive flow of values through procedures and data structures. Our approach combines and extends techniques from two seemingly disparate areas: polymorphic subtyping and interprocedural dataflow analysis based on context-free language reachability. The resulting technique offers several advantages over previous approaches: it works directly on higher-order programs, provides demand-driven interprocedural queries, and improves the asymptotic complexity of a known algorithm based on polymorphic subtyping from O(n^8) to O(n^3) for computing all queries. For intra-procedural flow restricted to equivalence classes, our algorithm yields linear inter-procedural flow queries.

  7. Tracking down Exceptions in Standard ML Programs
    M. Fähndrich, J. Foster, Alexander Aiken, Jason Cu

    We describe our experience with an exception analysis tool for Standard ML. Information about exceptions gathered by the analysis is visualized using PAM, a program visualization tool for EMACS. We study the results of the analysis of three well-known programs, classifying exceptions as assertion failures, error exceptions, control-flow exceptions, and pervasive exceptions. Even though the analysis is often conservative and reports many spurious exceptions, we have found it useful for checking the consistency of error and control-flow exceptions. Furthermore, using our tools, we have uncovered two minor exception-related bugs in the three programs we scrutinized.

Miscellaneous