The Programming Language Mondrian

In the preface of the Haskell report the hope is expressed that extensions or variants of the language appear that incorporate experimental features. The scripting language Mondrian is such an experiment. Mondrian evolves from Haskell by deleting some of Haskell's more complicated constructs and adding a few simple new ones.

The current version of Mondrian is just the core subset of Haskell. The plan is to let the language grow by demand and experimentation. The first extension we are planning to add are implicit arguments, as are already available as an extension to Hugs98.

At the implementation level an interesting aspect of Mondrian the fact that we use Java as the target language, borrowing as much as possible from Java. The only aspect of Mondrian that we make explicit in the translation to Java is call-by-name and sharing. In fact, one one can say that Mondrian is a domain specific language embedded in Java!

The Mondrian collective currently consists of Erik Meijer and Arjan van IJzendoorn. You are invited to join us!

The Mondrian Runtime

Following the Open Source Software adagio "release fast, release often", we make the Mondrian run-time system available, even though we are still working on our frontend. The target Java is so close to the Mondrian source that you can easily write the Java by hand.

For example, the target code for the function I = \x -> x checks whether there is at least one argument on the stack, pops it from the stack and ENTER()s it:

  class I implements Code
  {
   public Object ENTER()
   {
    VM.COLLECT(1,this);
    final Code a = VM.POP();
    return a.ENTER();
   }
  }
Unfortunately, Java does not support tail calls, so we adopt the trampoline trick where we return the code that we are about to enter, and leave it to a top-level "interpreter" loop to enter that code. Hence, the tail-call avoiding version of the identity function is:
  class I implements Code
  {
   public Object ENTER()
   {
    VM.COLLECT(1,this);
    final Code a = VM.POP();
    return a;
   }
  }
The toplevel interpreter is implemented by the method WHNF.

The K = \x y -> x combinator is straightforward to compile as well. Take two arguments from the stack and enter the first:

  class K implements Code
  {
   public Object ENTER()
   {
    VM.COLLECT(2,this);
    final Code x = VM.POP();
    final Code y = VM.POP();
    return x;
   }
  }

Things get a little more interesting when we get complex arguments as in S = \f g x -> f x (g x). In this case we have to construct a new updatable piece of code using the new Thunk(...) constructor, passing it an anonymous innerclass new Code(){...} that contains the code for the argument expression:

  class S implements Code
  {
   public Object ENTER()
   {
    VM.COLLECT(3,this);
    final Code f = VM.POP();
    final Code g = VM.POP();
    final Code x = VM.POP();
    VM.PUSH(new Thunk(new Code(){
        public Object ENTER()
        {
         VM.PUSH(x);
         return g;
        }
    }));
    VM.PUSH(x);
    return f;
   }
  }

A Simple Translation Scheme

The translation below is a slight variation on the translation described in the unfinished draft Down With Lambda-Lifting.

All the translation schemes below implement the Code interface that represents a code pointer:

  interface Code
  {
    Object ENTER();
  }

A Thunk is a special kind of code pointer that can be updated. If a Thunk is ENTER()-ed for the first time, it installs an update marker to update itself with the result of evaluating its internal code pointer.

  class Thunk implements Code
  {
    public Code code;
    private boolean virgin = true;

    Thunk(Code code){ this.code = code; }

    public Object ENTER()
    {
      if (virgin){
        VM.INSTALL(this);
        virgin = false;
      }
      return this.code;
    }

    void update(Code code){ this.code = code; }
  }
  

Definitions

A top-level definition D[f = \x0 ... xn-1 -> e] is translated into a class that implements the interface Code

  class f implements Code
  {
    public Object ENTER()
    {
     VM.COLLECT(n,this);
     final Code x0 = VM.POP();
     ...
     final Code xn-1 = VM.POP();
     E[e]
    }
  }

Expressions

The translation E[f a] of an application expression pushes the code for argument a on the stack:

  A[a]
  E[f]
The translation E[xi] of an atomic expression where xi is an argument returns the code to the top-level loop:
  return xi;
in case of a defined function E[g] it returns a new instance of the code of that function to the top-level interpreter:
  return new g();
Note that under this simple translation scheme CAFs are not shared.

Arguments

Atomic arguments are easy, a reference to a defined function A[g] returns a new instance of the corresponding class

  VM.PUSH(new g());
while a reference to an argument A[xi] returns that argument
  VM.PUSH(xi);
For complex arguments A[e] where e is not an argument or function, we have to build a new thunk
  VM.PUSH(new Thunk(new Code(){
    public Object ENTER()
    {
     E[e]
    }
  }));

Case expressions and algebraic datatypes

For concreteness, we restrict ourselves to the Haskell data type of lists

  data List a = Nil | Cons a (List a)
is translated into three corresponding Java classes
  abstract class List{}
  class Nil extends List{}
  class Cons extends List
  {
    Code head;
    Code tail;

    Cons(Code head, Code tail)
    {
      this.head = head; this.tail = tail;
    }
  }
We however do not use the constructors for Nil and Cons to build lists but the following wrapper objects that implement the Code interface
  class cons implements Code
  {
    public Object ENTER()
    {
      VM.COLLECT(2, this);
      final Code head = VM.POP();
      final Code tail = VM.POP();
      return (new Cons(head, tail));
    }
  }

  class nil implements Code
  {
    public Object ENTER()
    {
     return new Nil();
    }
  }

Case expressions E[case e1 of { Nil -> e2; Cons head tail -> e2 }] scrutinize the expression e1 into WHNF and perform a switch on the result, continuing by evaluating either e1 or e2.

  final List x = VM.WHNF(A[e1]);
  if (x instanceof Nil)
  {
    E[e1]
  }
  else
  {
    final Code head = ((Cons)x).head;
    final Code tail = ((Cons)x).tail;
    E[e2]
  }

To show how close the Java code is to the original Mondrian source, we show the translation of foldr:

  /**
  *   foldr = \op e xs ->
  *     case xs of
  *       { Nil       -> e
  *       ; Cons y ys -> op y (foldr op e ys)
  *       }
  */
  class foldr implements Code
  {
    public Object ENTER()
    {
      VM.COLLECT(3, this);
      final Code op = VM.POP();
      final Code e  = VM.POP();
      final Code xs = VM.POP();
      final List xs_ = (List)VM.WHNF(xs);
      if (xs_ instanceof Nil)
      {
        return e;
      }
      else
      { final Code y = ((Cons)xs_).head;
        final Code ys = ((Cons)xs_).tail;
        VM.PUSH(new Thunk (new Code(){
          public Object ENTER()
          {
            VM.PUSH(ys);
            VM.PUSH(e);
            VM.PUSH(op);
            return new foldr();
          }
      }));
      VM.PUSH(y);
      return op;
    }
    }
  }

Integers and boxing

Mondrian uses Java's Integer class as its representation of boxed integers (and similarly for all other primitive types with their corresponding wrapped reference types). To construct closure for a boxed integer, we use the IntLit(int) constructor of the IntLit class.

  class IntLit implements Code
  {
    public Integer value;

    public Object ENTER()
    {
      return this.value;
    }

    IntLit(int value)
    {
      this.value = new Integer(value);
    }
  }
So the translation E[n] of an integer literal is return new IntLit(n);. When passing an integer literal as an argument A[n] we get PUSH(new IntLit(n));.

Pattern matching E[case e1 of { ...; i -> ei; ...; n -> en}] (where n is the default case) against an integer is a two step process. First we bring it to WHNF, then we inspect its intValue():

  x = ((Integer)VM.WHNF(e1)).intValue();
  switch (x)
  {
    case i: E[ei]
    default: final Code n = new IntLit(x);
             E[n]
  }

As an example of translating pattern matching on integers, we compile the ubiquitous factorial function fac = \n -> case n of { 0 -> 1; m -> m * fac (m-1) } into the following class.

  class fac implements Code
  {
    public Object ENTER()
    {
      VM.COLLECT(1, this);
      final Code n = VM.POP();
      int x = ((Integer)(VM.WHNF(n)).intValue();
      switch (x)
      {
       case 0: return new IntLit(1);
      default: final Code m = new IntLit(x);
               VM.PUSH(new Thunk(new Code(){
                  public Object ENTER()
                  {
                   VM.PUSH(new Thunk (new Code(){
                        public Object ENTER()
                        {
                          VM.PUSH(new IntLit(1));
                          VM.PUSH(m);
                          return new minus();
                        }
                   }));
                   return new fac();
                   }
                }));
                VM.PUSH(m);
                return new times();
            }
          }
      }

connoisseur of implementing functional languages will notice immediately that the techniques of dealing with unboxed values from Peyton Jones and Launchbury are applicable here as well. Instead of encoding sequencing using case, we use Java's "semicolon". So, our definition of the primitive minus function is

  class minus implements Code
  {
    public Code ENTER()
    {
      VM.COLLECT(2,this)
      final Code a = VM.POP();
      final Code b = VM.POP();
      int x = ((Integer)(VM.WHNF(a))).intValue();
      int y = ((Integer)(VM.WHNF(b))).intValue();
      return new Integer(x+y);
    }
  }
Consequently, instead of having laws about case expressions in the source language, we get laws about sequencing, WHNF, and switch in the target language. Here are a few laws
  X x = E; ...; X y = E; ... ===> X x = E; ...; X y = x; ...
  switch (x){ default: E }   ===> E
Of course, we can easily do case-based transformations on the source language and translate them into Java afterwards by translating E[case x of Int x# -> e] into int x = ((Integer)(VM.WHNF(a))).intValue(); E[e].

Obtaining the sources

This pre-release of the Mondrian system consists of the following Java sources

Runtime.java
The Mondrian abstract machine VM class
Examples.java
Some simple examples
Monad.java
Classes for dealing with the IO monad
List.java
Classes for the List data type
Bool.java
Classes for booleans
Integer.java
Classes for integer arithmetic
Unit.java
Classes for the () type
Sieve.java
Demo program for generating the infinite list of prime numbers
Prelude.java
Simple prelude from Implementing functional languages: a tutorial by Peyton Jones and Lester
Core.java
Example programs from Peyton Jones and Lester


Last modified on by Erik Meijer.