Publications
Concurrent Revisions

Sebastian Burckhardt, Daan Leijen, Caitlin Sadowski, Jaeheon Yi, and Thomas Ball, Two for the Price of One: A Model for Parallel and Incremental Computation, in Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA'11), ACM SIGPLAN, Portland, Oregon, 22 October 2011

Parallel or incremental versions of an algorithm can significantly outperform their counterparts, but are often difficult to develop. Programming models that provide appropriate abstractions to decompose data and tasks can simplify parallelization. We show in this work that the same abstractions can enable both parallel and incremental execution. We present the first known algorithm for parallel self-adjusting computation. This algorithm extends a deterministic parallel programming model (concurrent revisions) with support for recording and repeating computations. On record, we construct a dynamic dependence graph of the parallel computation. On repeat, we reexecute only parts whose dependencies have changed. We implement and evaluate our idea by studying five example programs, including a realistic multi-pass CSS layout algorithm. We describe programming techniques that proved particularly useful to improve the performance of self-adjustment in practice. Our final results show significant speedups on all examples (up to 37x on an 8-core machine). These speedups are well beyond what can be achieved by parallelization alone, while requiring a comparable effort by the programmer.

This paper won an OOPSLA Distinguished Paper Award.

Daan Leijen, Sebastian Burckhardt, and Manuel Fahndrich, Prettier Concurrency: Purely Functional Concurrent Revisions, in Haskell Symposium 2011 (Haskell'11), ACM SIGPLAN, Tokyo, Japan, 7 July 2011

This article presents an extension to the work of Launchbury and Peyton-Jones on the ST monad. Using a novel model for concurrency, called "concurrent revisions", we show how we can use concurrency together with imperative mutable variables, while still being able to safely convert such computations (in the Rev monad) into pure values again.

In contrast to many other transaction models, like software transactional memory (STM), concurrent revisions never use rollback and always deterministically resolve conflicts. As a consequence, concurrent revisions integrate well with side-effecting I/O operations. Using deterministic conflict resolution, concurrent revisions can deal well with situations where there are many conflicts between different threads that modify a shared data structure. We demonstrate this by describing a concurrent game with conflicting concurrent tasks.

Sebastian Burckhardt and Daan Leijen, Semantics of Concurrent Revisions, in European Symposium on Programming (ESOP'11), Springer Verlag, Saarbrucken, Germany, March 2011

Enabling applications to execute various tasks in parallel is difficult if those tasks exhibit read and write conflicts. We recently developed a programming model based on concurrent revisions that addresses this challenge in a novel way: each forked task gets a conceptual copy of all the shared state, and state changes are integrated only when tasks are joined, at which time write-write conflicts are deterministically resolved. In this paper, we study the precise semantics of this model, in particular its guarantees for determinacy and consistency. First, we introduce a revision calculus that concisely captures the programming model. Despite allowing concurrent execution and locally nondeterministic scheduling, we prove that the calculus is confluent and guarantees determinacy. We show that the consistency guarantees of our calculus are a logical extension of snapshot isolation with support for conflict resolution and nesting. Moreover, we discuss how custom merge functions can provide stronger guarantees for particular data types that are tailored to the needs of the application. Finally, we show we can visualize the nonlinear history of state in our computations using revision diagrams that clarify the synchronization between tasks and allow local reasoning about state updates.

An extended version of this paper is available as TechReport MSR-TR-2010-94. (see link below)

Sebastian Burckhardt, Daan Leijen, and Manuel Fahndrich, Roll Forward, Not Back: A Case for Deterministic Conflict Resolution, in The 2nd Workshop on Determinism and Correctness in Parallel Programming (WODET'11), Newbeach, California, March 2011

Enabling applications to execute various tasks in parallel is difficult if those tasks exhibit read and write conflicts. In recent work, we developed a programming model based on concurrent revisions that addresses this challenge: each forked task gets a conceptual copy of all locations that are declared to be shared. Each such location has a specific isolation type; on joins, state changes to each location are merged deterministically based on its isolation type. In this paper, we study how to specify isolation types abstractly using operation-based compensation functions rather than state based merge functions. Using several examples including a list with insert, delete and modify operations, we propose compensation tables as a concise, general and intuitively accessible mechanism for determining how to merge arbitrary operation sequences. Finally, we provide sufficient conditions to verify that a state-based merge function correctly implements a compensation table.

Sebastian Burckhardt, Alexandro Baldassion, and Daan Leijen, Concurrent Programming with Revisions and Isolation Types, in Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA'10), ACM SIGPLAN, Reno, NV, October 2010

Building applications that are responsive and can exploit parallel hardware while remaining simple to write, understand, test, and maintain, poses an important challenge for developers. In particular, it is often desirable to enable various tasks to read or modify shared data concurrently without requiring complicated locking schemes that may throttle concurrency and introduce bugs. We introduce a mechanism that simplifies the use of concurrent heterogeneous tasks. Programmers declare what data they wish to share between tasks by using isolation types, and execute tasks concurrently by forking and joining revisions. These revisions are isolated: they read and modify their own private copy of the shared data only. A runtime creates and merges copies automatically, and resolves conflicts deterministically, in a manner declared by the chosen isolation type. To demonstrate the practical viability of our approach, we developed an efficient algorithm and an implementation in the form of a C# library, and used it to parallelize an interactive game application. Our results show that the parallelized game, while simple and very similar to the original sequential game, achieves satisfactory speedups on a multicore processor.

Daan Leijen, Wolfram Schulte, and Sebastian Burkhardt, The Design of a Task Parallel Library, in Proceeding of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications (OOPSLA'09), ACM SIGPLAN, Atlanta, FL, September 2009

The Task Parallel Library (TPL) is a library for .NET that makes it easy to take advantage of potential parallelism in a program. The library relies heavily on generics and delegate expressions to provide custom control structures expressing structured parallelism such as map-reduce in user programs. The library implementation is built around the notion of a task as a finite CPU-bound computation. To capture the ubiquitous apply-to-all pattern the library also introduces the novel concept of a replicable task. Tasks and replicable tasks are assigned to threads using work stealing techniques, but unlike traditional implementations based on the THE protocol, the library uses a novel data structure called a 'duplicating queue'. A surprising feature of duplicating queues is that they have sequentially inconsistent behavior on architectures with weak memory models, but capture this non-determinism in a benign way by sometimes duplicating elements. TPL ships as part of the Microsoft Parallel Extensions for the .NET framework 4.0, and forms the foundation of Parallel LINQ queries (however, note that the productized TPL library may differ in significant ways from the basic design described in this article).

Effects

Daan Leijen, Koka: Programming with Row-Polymorphic Effect Types, no. MSR-TR-2013-79, 28 August 2013

Note: This is an updated article: a previous version of this article contained a wrong lemma and corresponding mistakes in various proofs of Section 5.

We propose a programming model where effects are treated in a disciplined way, and where the potential side-effects of a function are apparent in its type signature. The type and effect of expressions can also be inferred automatically, and we describe a polymorphic type inference system based on Hindley-Milner style inference. A novel feature is that we support polymorphic effects through row-polymorphism using duplicate labels. Moreover, we show that our effects are not just syntactic labels but have a deep semantic connection to the program. For example, if an expression can be typed without an exn effect, then it will never throw an unhandled exception. Similar to Haskell's `runST` we show how we can safely encapsulate stateful operations. Through the state effect, we can also safely combine state with let- polymorphism without needing either imperative type variables or a syntactic value restriction. Finally,our system is implemented fully in a new language called Koka and has been used successfully on various small to medium- sized sample programs ranging from a Markdown processor to a tier-splitted chat application.

Nikhil Swamy, Nataliya Guts, Daan Leijen, and Michael Hicks, Lightweight Monadic Programming in ML, in Proceedings of the International Conference of Functional Programming (ICFP'11), ACM SIGPLAN, Tokyo, Japan, 1 July 2011

Many useful programming constructions can be expressed as monads. Examples include support for probabilistic computations, time-varying expressions, parsers, and information flow tracking, not to mention effectful features like state and I/O. In this paper, we present a type-based rewriting algorithm to make programming with arbitrary monads as easy as using ML's built-in support for state and I/O. Programmers write programs using monadic values of type M t as if they were of type t, and our algorithm inserts the necessary binds, units, and monad-to-monad morphisms so that the program typechecks. Our algorithm is based on Jones' qualified types and enjoys three useful properties: (1) principal types, i.e., the rewriting we perform is the most general; (2) coherence, i.e., thanks to the monad and morphism laws, all instances of the principal rewriting have the same semantics; (3) decidability; i.e., the solver for generated constraints will always terminate. Throughout the paper we present simple examples from the domains listed above. Our most complete example, which illustrates the expressive power of our system, proves that ML programs rewritten by our algorithm to use the information flow monad are equivalent to programs in FlowCaml, a domain-specific information flow tracking language.

Nikhil Swamy, Nataliya Guts, Daan Leijen, and Michael Hicks, Lightweight Monadic Programming in ML, no. MSR-TR-2011-39, 24 March 2011

Many useful programming constructions can be expressed as monads. Examples include support for probabilistic computations, time-varying expressions, parsers, and information flow tracking, not to mention effectful features like state and I/O. In this paper, we present a type-based rewriting algorithm to make programming with arbitrary monads as easy as using ML's built-in support for state and I/O. Programmers write programs using monadic values of type M t as if they were of type t, and our algorithm inserts the necessary binds, units, and monad-to-monad morphisms so that the program typechecks. Our algorithm is based on Jones' qualified types and enjoys three useful properties: (1) principal types, i.e., the rewriting we perform is the most general; (2) coherence, i.e., thanks to the monad and morphism laws, all instances of the principal rewriting have the same semantics; (3) decidability; i.e., the solver for generated constraints will always terminate. Throughout the paper we present simple examples from the domains listed above. Our most complete example, which illustrates the expressive power of our system, proves that ML programs rewritten by our algorithm to use the information flow monad are equivalent to programs in FlowCaml, a domain-specific information flow tracking language.

Ross Tate, Daan Leijen, and Sorin Lerner, A Flexible Semantic Framework for Effects , July 2010

Note: this is an unpublished manuscript.

Effects are a powerful and convenient component of programming. They enable programmers to interact with the user, take advantage of efficient stateful memory, throw exceptions, and nondeterministically execute programs in parallel. However, they also complicate every aspect of reasoning about a program or language, and as a result it is crucially important to have a good understanding of what effects are and how they work. In this paper we present a new framework for formalizing the semantics of effects that is more general and thorough than previous techniques while clarifying many of the important concepts. By returning to the category theoretic roots of monads, our framework is rich enough to describe the semantics of effects for a large class of languages including common imperative and functional languages. It is also capable of capturing more expressive, precise, and practical effect systems than previous approaches. Finally, our framework enables one to reason about effects in a language-independent manner, and so can be applied to many stages of language design and implementation in order to create more broadly applicable tools for programming languages.

Daan Leijen and Ross Tate, Convenient Explicit Effects using Type Inference with Subeffects, no. MSR-TR-2010-80, 22 June 2010

Most programming languages in use today let one freely use arbitrary (side) effects. This is despite the fact that unknown and unrestricted side effects are the cause of many software problems. We propose a programming model where effects are treated in a disciplined way, and where the potential side-effects of a function are apparent in its type signature. In contrast to most effect systems that are meant for internal compiler optimizations, our system is designed to be used by the programmer. Inspired by Haskell, we use a coarse-grained hierarchy of effects, like |Pure| and |IO|, which makes it convenient to read and write type signatures. The type and effect of expressions can also be inferred automatically, and we describe a polymorphic type inference system based on Hindley-Milner style inference.

Inference for first-class polymorphism

The following papers on HML and HMF both describe two interesting corners in the design space of type inference systems for first-class polymorphism:

  • The MLF system (by Didier Remy and Didier LeBotlan) represents an upper bound: it arguably requires the least annotations possible while being very expressive (it requires only annotations on function parameters that are used polymorphically). However, it is also the most complex using rigidly bound types and a relatively involved type inference algorithm.
  • The HMF paper describes a lower bound in the design space: it uses just regular System-F types and type inference is just a small extension of the usual algorithm W. It is a lower bound in the sense that it would be hard to come up with a simpler system that has the same expressiveness without requiring more annotations. This is the system you would use if you want first-class polymorphism with minimal implementation effort. There is a reference implementation of the type system written in Haskell that contains many examples. It is available in two variants, one using updateable references (prototype-ref), and one using explicit substitutions (prototype-subst).
  • The Flexible types paper (or HML) describes a type system that lies between these design points: it is a simplification of MLF that uses just flexible types and it has a very simple annotation rule: only function parameters that are polymorphic need an annotation. It is arguably the most expressive system possible when using types with flexible bounds only. This is the system you would use if you want powerful support for first-class polymorphism and are willing to put more effort into the implementation.

    There is a small but complete reference implementation of the type system written in Haskell that contains many examples.

Daan Leijen, Flexible types: robust type inference for first-class polymorphism, in 36th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL'09), ACM SIGPLAN, Savannah, Georgia, January 2009

We present HML, a type inference system that supports full firstclass polymorphism where few annotations are needed: only function parameters with a polymorphic type need to be annotated. HML is a simplification of MLF where only flexibly quantified types are used. This makes the types easier to work with from a programmers perspective, and simplifies the implementation of the type inference algorithm. Still, HML retains much of the expressiveness of MLF, it is robust with respect to small program transformations, and has a simple specification of the type rules with an effective type inference algorithm that infers principal types.

Extended version available as Microsoft Research technical report MSR-TR-2008-55, Mar 2008 (see link below)

Daan Leijen, HMF: Simple Type Inference for First-Class Polymorphism, in 13th ACM symp. of the International Conference on Functional Programming (ICFP'08), Association for Computing Machinery, Inc., Victoria, BC, Canada, September 2008

HMF is a conservative extension of Hindley-Milner type inference with first-class polymorphism and regular System F types. The system distinguishes itself from other proposals with simple type rules and a very simple type inference algorithm that is just a small extension of the usual Damas-Milner algorithm. Given the relative simplicity and expressive power, we feel that HMF can be a very attractive type system in practice.

Extended version with proofs available as Microsoft Research technical report MSR-TR-2007-118, Sep 2007 (see link below)

Daan Leijen, Flexible types: robust type inference for first-class polymorphism, no. MSR-TR-2008-55, March 2008

Daan Leijen, HMF: Simple Type Inference for First-Class Polymorphism, no. MSR-TR-2007-118, 30 October 2007

Daan Leijen, A Type Directed Translation from MLF to System F, in Proceedings of the ACM International Conference on Functional Programming (ICFP'07), ACM SIGPLAN, Freiburg, Germany, 1 October 2007

The MLF type system by Le Botlan and Rémy (2003) is a natural extension of Hindley-Milner type inference that supports full firstclass polymorphism, where types can be of higher-rank and impredicatively instantiated. Even though MLF is theoretically very attractive, it has not seen widespread adoption. We believe that this partly because it is unclear how the rich language of MLF types relate to standard System F types. In this article we give the first type directed translation of MLF terms to System F terms. Based on insight gained from this translation, we also define “Rigid MLF” (MLF =), a restriction of MLF where all bound values have a System F type. The expressiveness of MLF = is the same as that of boxy types, but MLF = needs fewer annotations and we give a detailed comparison between them.

Daan Leijen and Andres Löh, Qualified types for MLF, in The International Conference on Functional Programming (ICFP'05), ACM SIGPLAN, Tallin, Estonia, September 2005

ML-F is a type system that extends a functional language with impredicative higher-rank polymorphism. Type inference remains possible and only in some clearly defined situations, a local type annotation is required. Qualified types are a general concept that can accomodate a wide range of type systems extension, for example, type classes in Haskell. We show how the theory of qualified types can be used seamlessly with the higher-ranked impredicative polymorphism of ML-F. A core contribution of the paper is that we show a solution to the non-trivial problem of evidence translation in the presence of impredicative data types. (Note: one can experiment with MLF types with the experimental Morrow interpreter. However, this does not yet implement the qualified type extension discussed in the paper)

All publications

2014

Daan Leijen, Koka: Programming with Row Polymorphic Effect Types, in Mathematically Structured Functional Programming 2014, EPTCS, 18 March 2014

We propose a programming model where effects are treated in a disciplined way, and where the potential side-effects of a function are apparent in its type signature. The type and effect of expressions can also be inferred automatically, and we describe a polymorphic type inference system based on Hindley-Milner style inference. A novel feature is that we support polymorphic effects through row-polymorphism using duplicate labels. Moreover, we show that our effects are not just syntactic labels but have a deep semantic connection to the program. For example, if an expression can be typed without an exn effect, then it will never throw an unhandled exception. Similar to Haskell's `runST` we show how we can safely encapsulate stateful operations. Through the state effect, we can also safely combine state with let-polymorphism without needing either imperative type variables or a syntactic value restriction. Finally, our system is implemented fully in a new language called Koka and has been used successfully on various small to medium-sized sample programs ranging from a Markdown processor to a tier-splitted chat application. You can try out Koka live at http://www.rise4fun.com/koka/tutorial.

Sebastian Burckhardt, Daan Leijen, and Manuel Fahndrich, Cloud Types: Robust Abstractions for Replicated Shared State, no. MSR-TR-2014-43, March 2014

In the age of cloud-connected mobile devices, users want responsive apps that read and write shared data everywhere, at all times, even if network connections are slow or unavailable. Cloud types have been proposed as a solution that lets programmers declare, read, and update shared structured data while hiding tricky consistency issues related to update propagation and conflict resolution. However, previous work on cloud types does not satisfactorily address (1) how to best understand the weak consistency model, (2) where consistency pitfalls continue to lurk, and (3) how to implement the system efficiently and reliably.

We address these questions by (1) introducing the GLUT model (global log of update transactions), suitable as a mental reference model that helps programmers to visualize executions and reason about correctness, (2) describing typical consistency pitfalls (anti-patterns) and how to avoid them using cloud types, and (3) an efficient client-server implementation of GLUT that uses robust streaming and optimal delta reduction.

2013

Daan Leijen, Koka: Programming with Row-Polymorphic Effect Types, no. MSR-TR-2013-79, 28 August 2013

Note: This is an updated article: a previous version of this article contained a wrong lemma and corresponding mistakes in various proofs of Section 5.

We propose a programming model where effects are treated in a disciplined way, and where the potential side-effects of a function are apparent in its type signature. The type and effect of expressions can also be inferred automatically, and we describe a polymorphic type inference system based on Hindley-Milner style inference. A novel feature is that we support polymorphic effects through row-polymorphism using duplicate labels. Moreover, we show that our effects are not just syntactic labels but have a deep semantic connection to the program. For example, if an expression can be typed without an exn effect, then it will never throw an unhandled exception. Similar to Haskell's `runST` we show how we can safely encapsulate stateful operations. Through the state effect, we can also safely combine state with let- polymorphism without needing either imperative type variables or a syntactic value restriction. Finally,our system is implemented fully in a new language called Koka and has been used successfully on various small to medium- sized sample programs ranging from a Markdown processor to a tier-splitted chat application.

Thomas Ball, Peli de Halleux, Daan Leijen, and Nikhil Swamy, Increasing Human-Tool Interaction via the Web, in 11th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering, ACM, June 2013

Software tools researchers can accelerate their ability to learn by exposing tools to users via web technologies, allowing them to observe and test the interactions between humans and tools. At Microsoft Research, we have developed a web service (http: //www.rise4fun.com/) for such a purpose that is available for community use.

2012

Sebastian Burckhardt, Manuel Fahndrich, Daan Leijen, and Benjamin P. Wood, Cloud Types for Eventual Consistency, in Proceedings of the 26th European Conference on Object-Oriented Programming (ECOOP), Springer, 15 June 2012

Mobile devices commonly access shared data stored on a server. To ensure responsiveness, many applications maintain local replicas of the shared data that remain instantly accessible even if the server is slow or temporarily unavailable. Despite its apparent simplicity and commonality, this scenario can be surprisingly challenging. In particular, a correct and reliable implementation of the communication protocol and the conflict resolution to achieve eventual consistency is daunting even for experts. To make eventual consistency more programmable, we propose the use of specialized cloud data types. These cloud types provide eventually consistent storage at the programming language level, and thus abstract the numerous implementation details (servers, networks, caches, protocols). We demonstrate (1) how cloud types enable simple programs to use eventually consistent storage without introducing undue complexity, and (2) how to provide cloud types using a system and protocol comprised of multiple servers and clients.

Manuel Fahndrich, Michael Barnett, Daan Leijen, and Francesco Logozzo, Integrating a Set of Contract Checking Tools into Visual Studio, in Proceedings of the 2012 Second International Workshop on Developing Tools as Plug-ins (TOPI 2012), IEEE, 3 June 2012

Integrating tools and extensions into existing languages,

compilers, debuggers, and IDEs can be difficult, workintensive,

and often results in a one-off integration. In this

paper, we report on our experience of building and integrating

the CodeContract tool set into an existing programming

environment. The CodeContract tools enable 1) authoring of

contracts (preconditions, postconditions, and object invariants),

  1. Instrumenting contract checks into code, 3) statically checking

code against contracts, and 4) visualizing contracts and

results. We identify three characteristics of our integration that

allowed us to reuse existing compilers and IDEs, increase the

reach of our tools to multiple languages and target platforms,

and maintain the tools over three consecutive versions of C#

and Visual Studio with little effort. These principles are 1)

use source embedding for new language features, 2) use target

analysis and rewriting, and 3) use generic plug-ins to isolate

tools from the IDE.

Sebastian Burckhardt, Manuel Fahndrich, Daan Leijen, and Mooly Sagiv, Eventually Consistent Transactions, in Proceedings of the 22n European Symposium on Programming (ESOP), Springer, 24 March 2012

When distributed clients query or update shared data, eventual consistency can provide better availability than strong consistency models. However, programming and implementing such systems can be difficult unless we establish a reasonable consistency model, i.e. some minimal guarantees that programmers can understand and systems can provide effectively. To this end, we propose a novel consistency model based on eventually consistent transactions. Unlike serializable transactions, eventually consistent transactions are ordered by two order relations (visibility and arbitration) rather than a single order relation. To demonstrate that eventually consistent transactions can be effectively implemented, we establish a handful of simple operational rules for managing replicas, versions and updates, based on graphs called revision diagrams. We prove that these rules are sufficient to guarantee correct implementation of eventually consistent transactions. Finally, we present two operational models (single server and server pool) of systems that provide eventually consistent transactions.

2011

Sebastian Burckhardt, Daan Leijen, Caitlin Sadowski, Jaeheon Yi, and Thomas Ball, Two for the Price of One: A Model for Parallel and Incremental Computation, in Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA'11), ACM SIGPLAN, Portland, Oregon, 22 October 2011

Parallel or incremental versions of an algorithm can significantly outperform their counterparts, but are often difficult to develop. Programming models that provide appropriate abstractions to decompose data and tasks can simplify parallelization. We show in this work that the same abstractions can enable both parallel and incremental execution. We present the first known algorithm for parallel self-adjusting computation. This algorithm extends a deterministic parallel programming model (concurrent revisions) with support for recording and repeating computations. On record, we construct a dynamic dependence graph of the parallel computation. On repeat, we reexecute only parts whose dependencies have changed. We implement and evaluate our idea by studying five example programs, including a realistic multi-pass CSS layout algorithm. We describe programming techniques that proved particularly useful to improve the performance of self-adjustment in practice. Our final results show significant speedups on all examples (up to 37x on an 8-core machine). These speedups are well beyond what can be achieved by parallelization alone, while requiring a comparable effort by the programmer.

This paper won an OOPSLA Distinguished Paper Award.

Sebastian Burckhardt, Manuel Fahndrich, Daan Leijen, and Mooly Sagiv, Eventually Consistent Transactions (full version), no. MSR-TR-2011-117, October 2011

When distributed clients query or update shared data, eventual consistency can provide better availability than strong consistency models. However, programming and implementing such systems can be difficult unless we establish a reasonable consistency model, i.e. some minimal guarantees that programmers can understand and systems can provide effectively.

To this end, we propose a novel consistency model based on eventually consistent transactions. Unlike serializable transactions, eventually consistent transactions are ordered by two order relations (visibility and arbitration) rather than a single order relation. To demonstrate that eventually consistent transactions can be effectively implemented, we establish a handful of simple operational rules for managing replicas, versions and updates, based on graphs called revision diagrams. We prove that these rules are sufficient to guarantee correct implementation of eventually consistent transactions. Finally, we present two operational models (single server and server pool) of systems that provide eventually consistent transactions.

Daan Leijen, Sebastian Burckhardt, and Manuel Fahndrich, Prettier Concurrency: Purely Functional Concurrent Revisions, in Haskell Symposium 2011 (Haskell'11), ACM SIGPLAN, Tokyo, Japan, 7 July 2011

This article presents an extension to the work of Launchbury and Peyton-Jones on the ST monad. Using a novel model for concurrency, called "concurrent revisions", we show how we can use concurrency together with imperative mutable variables, while still being able to safely convert such computations (in the Rev monad) into pure values again.

In contrast to many other transaction models, like software transactional memory (STM), concurrent revisions never use rollback and always deterministically resolve conflicts. As a consequence, concurrent revisions integrate well with side-effecting I/O operations. Using deterministic conflict resolution, concurrent revisions can deal well with situations where there are many conflicts between different threads that modify a shared data structure. We demonstrate this by describing a concurrent game with conflicting concurrent tasks.

Nikhil Swamy, Nataliya Guts, Daan Leijen, and Michael Hicks, Lightweight Monadic Programming in ML, in Proceedings of the International Conference of Functional Programming (ICFP'11), ACM SIGPLAN, Tokyo, Japan, 1 July 2011

Many useful programming constructions can be expressed as monads. Examples include support for probabilistic computations, time-varying expressions, parsers, and information flow tracking, not to mention effectful features like state and I/O. In this paper, we present a type-based rewriting algorithm to make programming with arbitrary monads as easy as using ML's built-in support for state and I/O. Programmers write programs using monadic values of type M t as if they were of type t, and our algorithm inserts the necessary binds, units, and monad-to-monad morphisms so that the program typechecks. Our algorithm is based on Jones' qualified types and enjoys three useful properties: (1) principal types, i.e., the rewriting we perform is the most general; (2) coherence, i.e., thanks to the monad and morphism laws, all instances of the principal rewriting have the same semantics; (3) decidability; i.e., the solver for generated constraints will always terminate. Throughout the paper we present simple examples from the domains listed above. Our most complete example, which illustrates the expressive power of our system, proves that ML programs rewritten by our algorithm to use the information flow monad are equivalent to programs in FlowCaml, a domain-specific information flow tracking language.

Nikhil Swamy, Nataliya Guts, Daan Leijen, and Michael Hicks, Lightweight Monadic Programming in ML, no. MSR-TR-2011-39, 24 March 2011

Many useful programming constructions can be expressed as monads. Examples include support for probabilistic computations, time-varying expressions, parsers, and information flow tracking, not to mention effectful features like state and I/O. In this paper, we present a type-based rewriting algorithm to make programming with arbitrary monads as easy as using ML's built-in support for state and I/O. Programmers write programs using monadic values of type M t as if they were of type t, and our algorithm inserts the necessary binds, units, and monad-to-monad morphisms so that the program typechecks. Our algorithm is based on Jones' qualified types and enjoys three useful properties: (1) principal types, i.e., the rewriting we perform is the most general; (2) coherence, i.e., thanks to the monad and morphism laws, all instances of the principal rewriting have the same semantics; (3) decidability; i.e., the solver for generated constraints will always terminate. Throughout the paper we present simple examples from the domains listed above. Our most complete example, which illustrates the expressive power of our system, proves that ML programs rewritten by our algorithm to use the information flow monad are equivalent to programs in FlowCaml, a domain-specific information flow tracking language.

Sebastian Burckhardt, Daan Leijen, and Manuel Fahndrich, Roll Forward, Not Back: A Case for Deterministic Conflict Resolution, in The 2nd Workshop on Determinism and Correctness in Parallel Programming (WODET'11), Newbeach, California, March 2011

Enabling applications to execute various tasks in parallel is difficult if those tasks exhibit read and write conflicts. In recent work, we developed a programming model based on concurrent revisions that addresses this challenge: each forked task gets a conceptual copy of all locations that are declared to be shared. Each such location has a specific isolation type; on joins, state changes to each location are merged deterministically based on its isolation type. In this paper, we study how to specify isolation types abstractly using operation-based compensation functions rather than state based merge functions. Using several examples including a list with insert, delete and modify operations, we propose compensation tables as a concise, general and intuitively accessible mechanism for determining how to merge arbitrary operation sequences. Finally, we provide sufficient conditions to verify that a state-based merge function correctly implements a compensation table.

Sebastian Burckhardt and Daan Leijen, Semantics of Concurrent Revisions, in European Symposium on Programming (ESOP'11), Springer Verlag, Saarbrucken, Germany, March 2011

Enabling applications to execute various tasks in parallel is difficult if those tasks exhibit read and write conflicts. We recently developed a programming model based on concurrent revisions that addresses this challenge in a novel way: each forked task gets a conceptual copy of all the shared state, and state changes are integrated only when tasks are joined, at which time write-write conflicts are deterministically resolved. In this paper, we study the precise semantics of this model, in particular its guarantees for determinacy and consistency. First, we introduce a revision calculus that concisely captures the programming model. Despite allowing concurrent execution and locally nondeterministic scheduling, we prove that the calculus is confluent and guarantees determinacy. We show that the consistency guarantees of our calculus are a logical extension of snapshot isolation with support for conflict resolution and nesting. Moreover, we discuss how custom merge functions can provide stronger guarantees for particular data types that are tailored to the needs of the application. Finally, we show we can visualize the nonlinear history of state in our computations using revision diagrams that clarify the synchronization between tasks and allow local reasoning about state updates.

An extended version of this paper is available as TechReport MSR-TR-2010-94. (see link below)

2010

Caitlin Sadowski and Daan Leijen, A Preliminary Survey on Functional Programming, no. MSR-TR-2010-147, 12 November 2010

Functional programming has had a profound impact on the development of mainstream languages such as C# or Java.We wanted to get a better sense of developer’s perceptions of functional programming, and also better understand which functional programming concepts are useful to developers. This paper reports the results of a preliminary survey on this topic.

Sebastian Burckhardt, Alexandro Baldassion, and Daan Leijen, Concurrent Programming with Revisions and Isolation Types, in Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA'10), ACM SIGPLAN, Reno, NV, October 2010

Building applications that are responsive and can exploit parallel hardware while remaining simple to write, understand, test, and maintain, poses an important challenge for developers. In particular, it is often desirable to enable various tasks to read or modify shared data concurrently without requiring complicated locking schemes that may throttle concurrency and introduce bugs. We introduce a mechanism that simplifies the use of concurrent heterogeneous tasks. Programmers declare what data they wish to share between tasks by using isolation types, and execute tasks concurrently by forking and joining revisions. These revisions are isolated: they read and modify their own private copy of the shared data only. A runtime creates and merges copies automatically, and resolves conflicts deterministically, in a manner declared by the chosen isolation type. To demonstrate the practical viability of our approach, we developed an efficient algorithm and an implementation in the form of a C# library, and used it to parallelize an interactive game application. Our results show that the parallelized game, while simple and very similar to the original sequential game, achieves satisfactory speedups on a multicore processor.

Sebastian Burckhardt and Daan Leijen, Semantics of Concurrent Revisions, no. MSR-TR-2010-94, 15 July 2010

Enabling applications to execute various tasks in parallel is difficult if those tasks exhibit read and write conflicts. We recently developed a programming model based on concurrent revisions that addresses this challenge in a novel way: each forked task gets a conceptual copy of all the shared state, and state changes are integrated only when tasks are joined, at which time write-write conflicts are deterministically resolved.

In this paper, we study the precise semantics of this model, in particular its guarantees for determinacy and consistency. First, we introduce a revision calculus that concisely captures the programming model. Despite allowing concurrent execution and locally nondeterministic scheduling, we prove that the calculus is confluent and guarantees determinacy. We show that the consistency guarantees of our calculus are a logical extension of snapshot isolation with support for conflict resolution and nesting. Moreover, we discuss how custom merge functions can provide stronger guarantees for particular data types that are tailored to the needs of the application. Finally, we show we can faithfully characterize concurrent computations as revision diagrams which show visually the synchronization between tasks and allow local reasoning about state updates. We prove various properties of revision diagrams, and show in particular that these graphs are semi-lattices.

Ross Tate, Daan Leijen, and Sorin Lerner, A Flexible Semantic Framework for Effects , July 2010

Note: this is an unpublished manuscript.

Effects are a powerful and convenient component of programming. They enable programmers to interact with the user, take advantage of efficient stateful memory, throw exceptions, and nondeterministically execute programs in parallel. However, they also complicate every aspect of reasoning about a program or language, and as a result it is crucially important to have a good understanding of what effects are and how they work. In this paper we present a new framework for formalizing the semantics of effects that is more general and thorough than previous techniques while clarifying many of the important concepts. By returning to the category theoretic roots of monads, our framework is rich enough to describe the semantics of effects for a large class of languages including common imperative and functional languages. It is also capable of capturing more expressive, precise, and practical effect systems than previous approaches. Finally, our framework enables one to reason about effects in a language-independent manner, and so can be applied to many stages of language design and implementation in order to create more broadly applicable tools for programming languages.

Daan Leijen and Ross Tate, Convenient Explicit Effects using Type Inference with Subeffects, no. MSR-TR-2010-80, 22 June 2010

Most programming languages in use today let one freely use arbitrary (side) effects. This is despite the fact that unknown and unrestricted side effects are the cause of many software problems. We propose a programming model where effects are treated in a disciplined way, and where the potential side-effects of a function are apparent in its type signature. In contrast to most effect systems that are meant for internal compiler optimizations, our system is designed to be used by the programmer. Inspired by Haskell, we use a coarse-grained hierarchy of effects, like |Pure| and |IO|, which makes it convenient to read and write type signatures. The type and effect of expressions can also be inferred automatically, and we describe a polymorphic type inference system based on Hindley-Milner style inference.

2009

Daan Leijen, Wolfram Schulte, and Sebastian Burkhardt, The Design of a Task Parallel Library, in Proceeding of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications (OOPSLA'09), ACM SIGPLAN, Atlanta, FL, September 2009

The Task Parallel Library (TPL) is a library for .NET that makes it easy to take advantage of potential parallelism in a program. The library relies heavily on generics and delegate expressions to provide custom control structures expressing structured parallelism such as map-reduce in user programs. The library implementation is built around the notion of a task as a finite CPU-bound computation. To capture the ubiquitous apply-to-all pattern the library also introduces the novel concept of a replicable task. Tasks and replicable tasks are assigned to threads using work stealing techniques, but unlike traditional implementations based on the THE protocol, the library uses a novel data structure called a 'duplicating queue'. A surprising feature of duplicating queues is that they have sequentially inconsistent behavior on architectures with weak memory models, but capture this non-determinism in a benign way by sometimes duplicating elements. TPL ships as part of the Microsoft Parallel Extensions for the .NET framework 4.0, and forms the foundation of Parallel LINQ queries (however, note that the productized TPL library may differ in significant ways from the basic design described in this article).

Daan Leijen, Flexible types: robust type inference for first-class polymorphism, in 36th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL'09), ACM SIGPLAN, Savannah, Georgia, January 2009

We present HML, a type inference system that supports full firstclass polymorphism where few annotations are needed: only function parameters with a polymorphic type need to be annotated. HML is a simplification of MLF where only flexibly quantified types are used. This makes the types easier to work with from a programmers perspective, and simplifies the implementation of the type inference algorithm. Still, HML retains much of the expressiveness of MLF, it is robust with respect to small program transformations, and has a simple specification of the type rules with an effective type inference algorithm that infers principal types.

Extended version available as Microsoft Research technical report MSR-TR-2008-55, Mar 2008 (see link below)

2008

Daan Leijen, HMF: Simple Type Inference for First-Class Polymorphism, in 13th ACM symp. of the International Conference on Functional Programming (ICFP'08), Association for Computing Machinery, Inc., Victoria, BC, Canada, September 2008

HMF is a conservative extension of Hindley-Milner type inference with first-class polymorphism and regular System F types. The system distinguishes itself from other proposals with simple type rules and a very simple type inference algorithm that is just a small extension of the usual Damas-Milner algorithm. Given the relative simplicity and expressive power, we feel that HMF can be a very attractive type system in practice.

Extended version with proofs available as Microsoft Research technical report MSR-TR-2007-118, Sep 2007 (see link below)

Daan Leijen, Flexible types: robust type inference for first-class polymorphism, no. MSR-TR-2008-55, March 2008

2007

Daan Leijen, HMF: Simple Type Inference for First-Class Polymorphism, no. MSR-TR-2007-118, 30 October 2007

Daan Leijen, A Type Directed Translation from MLF to System F, in Proceedings of the ACM International Conference on Functional Programming (ICFP'07), ACM SIGPLAN, Freiburg, Germany, 1 October 2007

The MLF type system by Le Botlan and Rémy (2003) is a natural extension of Hindley-Milner type inference that supports full firstclass polymorphism, where types can be of higher-rank and impredicatively instantiated. Even though MLF is theoretically very attractive, it has not seen widespread adoption. We believe that this partly because it is unclear how the rich language of MLF types relate to standard System F types. In this article we give the first type directed translation of MLF terms to System F terms. Based on insight gained from this translation, we also define “Rigid MLF” (MLF =), a restriction of MLF where all bound values have a System F type. The expressiveness of MLF = is the same as that of boxy types, but MLF = needs fewer annotations and we give a detailed comparison between them.

2005

Daan Leijen and Andres Löh, Qualified types for MLF, in The International Conference on Functional Programming (ICFP'05), ACM SIGPLAN, Tallin, Estonia, September 2005

ML-F is a type system that extends a functional language with impredicative higher-rank polymorphism. Type inference remains possible and only in some clearly defined situations, a local type annotation is required. Qualified types are a general concept that can accomodate a wide range of type systems extension, for example, type classes in Haskell. We show how the theory of qualified types can be used seamlessly with the higher-ranked impredicative polymorphism of ML-F. A core contribution of the paper is that we show a solution to the non-trivial problem of evidence translation in the presence of impredicative data types. (Note: one can experiment with MLF types with the experimental Morrow interpreter. However, this does not yet implement the qualified type extension discussed in the paper)

Daan Leijen, Extensible records with scoped labels, in Proceedings of the 2005 Symposium on Trends in Functional Programming (TFP'05), Tallin, Estonia, September 2005

Records provide a safe and flexible way to construct data structures. We describe a natural approach to typing polymorphic and extensible records that is simple, easy to use in practice, and straightforward to implement. A novel aspect of this work is that records can contain duplicate labels, effectively introducing a form of scoping over the labels. Furthermore, it is a fully orthogonal extension to existing type systems and programming languages. In particular, we show how it can be used conveniently with standard Hindley-Milner, qualified types, and ML-F. The records are implemented in the experimental Morrow interpreter. There is separate technical report that gives the constructive proofs of soundness and completeness.

2004

Daan Leijen, First-class labels for extensible rows, no. UU-CS-2004-51, December 2004

This paper describes a type system for extensible records and variants with first-class labels; labels are polymorphic and can be passed as arguments. This increases the expressiveness of conventional record calculi significantly, and we show how we can encode intersection types, closed-world overloading, type case, label selective calculi, and first-class messages. We formally motivate the need for row equality predicates to express type constraints in the presence of polymorphic labels. This naturally leads to an orthogonal treatment of unrestricted row polymorphism that can be used to express first-class patterns. Based on the theory of qualified types, we present an effective type inference algorithm and efficient compilation method.

Bastiaan Heeren and Daan Leijen, Functioneel Programmeren met Helium, in NIOC'2004, November 2004

Moderne functionele talen zijn mathematisch precies, notationeel elegant, en sterk getypeerd. Deze talen zijn daarom bij uitstek geschikt voor het onderwijzen van programmeerconcepten en algoritmiek. Helium is een onderzoeksproject waarbij een krachtig typesysteem gecombineerd wordt met precieze en gebruiksvriendelijke foutmeldingen

Daan Leijen and Bastiaan Heeren, Gebruikersvriendelijke compiler voor het onderwijs, in Informatie, vol. 46, no. 8, October 2004

Foutmeldingen van compilers zijn vaak moeilijk te interpreteren. Om het programmeeronderwijs te verbeteren wordt aan de Universiteit Utrecht de Helium compiler ontwikkeld die precieze, gebruikersvriendelijke foutmeldingen genereerd.

Daan Leijen, wxHaskell – A portable and concise GUI library for Haskell, in ACM SIGPLAN Haskell Workshop (Haskell'04), Snowbird, Utah, ACM Press, 22 September 2004

wxHaskell is a graphical user interface (GUI) library for Haskell that is built on wxWidgets: a free industrial strength GUI library for C++ that provides a common interface that has been ported to all major platforms, including Windows, Gtk, and MacOS X. In contrast with many other libraries, wxWidgets retains the native lookand-feel of each particular platform. We show how distinctive features of Haskell, like parametric polymorphism, higher-order functions, and first-class computations, can be used to present a concise and elegant monadic interface for portable GUI programs.

2003

Daan Leijen, The λ Abroad – A Functional Approach to Software Components, Dept. of Computer Science, Universiteit Utrecht, November 2003

Bastiaan Heeren, Daan Leijen, and Arjan van IJzendoorn, Helium, for Learning Haskell, in Proceedings of the ACM SIGPLAN Haskell Workshop (Haskell'03), Uppsala, Sweden, ACM SIGPLAN, 28 August 2003

Helium is a user-friendly compiler designed especially for learning the functional programming language Haskell. The quality of the error messages has been the main concern both in the choice of the language features and in the implementation of the compiler. Helium implements almost full Haskell, where the most notable difference is the absence of type classes. Our goal is to let students learn functional programming more quickly and with more fun. The compiler has been successfully employed in two introductory programming courses at Utrecht University.

Daan Leijen, Division and Modulus for Computer Scientists , July 2003

There exist many definitions of the div and mod functions in computer science literature and programming languages. We briefly review the most common definitions (truncated division, Knuth's floored division, etc.) and discuss the rare, but mathematically elegant, Euclidean division. We also give an algorithm for the Euclidean div and mod functions and prove it correct with respect to Euclid's theorem.

2001

Daan Leijen and Erik Meijer, Parsec: Direct Style Monadic Parser Combinators for the Real World, no. UU-CS-2001-27, July 2001

Despite the long list of publications on parser combinators, there does not yet exist a monadic parser combinator library that is applicable in real world situations. In particular naive implementations of parser combinators are likely to suffer from space leaks and are often unable to report precise error messages in case of parse errors. The Parsec parser combinator library described in this paper, utilizes a novel implementation technique for space and time efficient parser combinators that in case of a parse error, report both the position of the error as well as all grammar productions that would have been legal at that point in the input.

1999

Daan Leijen and Erik Meijer, Domain Specific Embedded Compilers, in 2nd USENIX Conference on Domain Specific Languages (DSL'99), Austin, Texas, October 1999

Domain-specific embedded languages (DSELs) expressed in higher-order, typed (HOT) languages provide a composable framework for domain-specific abstractions. Such a framework is of greater utility than a collection of stand-alone domain-specific languages. Usually, embedded domain specific languages are build on top of a set of domain specific primitive functions that are ultimately implemented using some form of foreign function call. We sketch a general design pattern for embedding client-server style services into Haskell using a domain specific embedded compiler for the server's source language. In particular we apply this idea to implement HaskellDB, a domain specific embedded compiler that dynamically generates of SQL queries from monad comprehensions, which are then executed on an arbitrary ODBC database server.

Daan Leijen, Erik Meijer, and James Hook, Haskell as an Automation Controller, in The 3rd International Summerschool on Advanced Functional Programming, Braga, Portugal, Springer-Verlag, Braga, Portugal, September 1999

An introduction to using (automation) COM components from Haskell and how to write hybrid applications with Haskell, J++ and VB. All the examples discussed in the article are included in the HaskellScript release.

Sigbjörn Finne, Daan Leijen, Erik Meijer, and Simon Peyton Jones, Calling Hell from Heaven and Heaven from Hell, in The International Conference on Functional Programming (ICFP'99), Paris, France, ACM Press, Paris, France, September 1999

The increasing popularity of component-based programming tools offer a big opporunity to designers of advanced programming languages, such as Haskell. If we can package our programs as COM objects, then it is easy to integrate them into applications written in other languages. In earlier work we described a preliminary integration of Haskell with Microsoft's Component Object Model (COM), focusing on how Haskell can create and invoke COM objects. This paper develops that work, concentrating on the mechanisms that support externally-callable Haskell functions, and the encapsulation of a Haskell program as a COM object.

1998

Sigbjörn Finne, Daan Leijen, Erik Meijer, and Simon Peyton Jones, H/Direct: A Binary Foreign Language Interface to Haskell, in Proceedings of the ACM SIGPLAN International Conference on Functional Programming (ICFP'98), ACM SIGPLAN, Baltimore, USA, September 1998

H/Direct is a foreign-language interface for the purely functional language Haskell. Rather than rely on host-language type signatures, H/Direct compiles Interface Definition Language (IDL) to Haskell stub code that marshals data across the interface. This approach allows Haskell to call both C and COM, and allows a Haskell component to be wrapped in a C or COM interface. IDL is a complex language and in contrast, we provide a relatively formal and precise definition of the mapping between Haskell and IDL.

Daan Leijen, Functional Components: COM components in Haskell, University of Amsterdam, June 1998

Simon Peyton Jones, Erik Meijer, and Daan Leijen, Scripting COM components from Haskell, in Fifth International Conference on Software Reuse (ICSR'98), Victoria, BC, Canada, IEEE Computer Society Press, Victoria, B.C., Canada, June 1998

The expresiveness of higher-order, typed languages such as Haskell or ML makes them an attractive medium in which to write software components. Hitherto, however, their use has been limited by the all-or-nothing problem: it is hardto write just part of an application in these languages. Component-based programming using a binary standard such as Microsofts's Component Object Model (COM) offers a solution to this dilemma, by specifying a language-independent interface between components. This paper reports about our experience with exploiting this opportunity in the purely funtional language Haskell. We describe a design for integrating COM components into Haskell programs, and we demonstrate why someone might want to script their COM components this way.

Erik Meijer, Daan Leijen, and James Hook, Client-side Web Scripting with HaskellScript, in Practical Aspects of Declarative Languages (PADL'98), Springer-Verlag, Berlin, 1998

Using client-side scripting it is possible to build interactive web pages that don't need round-trips to the server for every user-event. The web browser exposes itself to the script via an object model (DOM), which means that scripts can add and remove page content, or change the position and color of elements via their style attributes. We explain the object model as implemented by Microsoft Internet Explorer by means of examples and report on our experiences of using Haskell as a programming language for client-side web scripting using the HaskellScript scripting engine.