The Koka Book
An Introduction and Specification of the Koka language.

🔗1. Getting started

Welcome to the Koka book. This provides an overview and formal specification of the language. For more background information, see:

  • The library documentation.
  • The Koka research page and the slides of a talk presented at Lang.Next (April 2012).
  • The source code of the Koka compiler.
  • The article Algebraic Effects for Functional Programming [3] about the algebraic effects in Koka.
  • An article about the type system and semantics of Koka [2].

🔗1.1. Installing the compiler

At this point there are no binary releases of Koka and you need to build the compiler yourself. Fortunately, Koka has few dependencies and builds without problems on most common platforms, e.g. Windows, MacOSX, and Unix.

The following programs are required to build Koka:

  • Mercurial, for the hg command used to clone the Koka sources.
  • Node.js, for the npm package manager used to install the jake and madoko build tools.

All these programs are very easy to install on most platforms. Now we can build Koka itself:

  1. First clone the Koka sources with algebraic effects support:

    > hg clone https://hg.codeplex.com/koka -b algeff

    You can also use -b algeff-dev to get the latest development version.

  2. Go to the newly created Koka directory:

    > cd koka
  3. Install any needed Node libraries using the Node package manager:

    > npm install

    If you are running on MacOSX or Unix, you may have to run this as sudo npm install so that the npm package manager has enough permissions to install the jake and madoko tools.

  4. Finally, build the compiler and run the Koka interactive environment:

    > jake

    You can type jake help to see an overview of all make targets.

The excellent Sublime text editor is recommended to edit Koka programs. You can install support for Koka programs using

> jake sublime

After this, .kk files will be properly highlighted. It is also recommended to use the newly installed snow color theme, which is designed to work well with Koka files.

🔗1.2. Running the interactive compiler

After running a plain jake command, the Koka interactive environment will start:

__          _
| |        | |
| | __ ___ | | __ __ _
| |/ // _ \| |/ // _` | welcome to the koka interpreter
|   <| (_) |   <| (_| | version 0.7.0-dev (debug), Jun 30 2016
|_|\_\\___/|_|\_\\__,_| type :? for help

loading: std/core

Now you can test some expressions:

> println("hi koka")
hi koka

> :t "hi"
string

> :t println("hi")
console ()

Or load a demo:

> :l demo/collatz
compile: lib/demo/collatz.kk
check  : demo/collatz
modules:
  demo/collatz

> main()
Collatz(27) took 111 steps.

And quit the interpreter:

> :q

Before the effect one believes in different causes than one does after the effect.
 -- Friedrich Nietzsche

You can also run examples in the browser by setting the host:

> :set --host=browser
> 1+2

A browser-specific demo to try is, for example, demo/dom/conway.kk.

🔗1.3. Algebraic effect handlers

A novel feature of Koka is a compiled and typed implementation of algebraic effect handlers (described in detail in [3]). In the interactive environment, you can load various demo files with algebraic effects which are located in the test/algeff directory. This is by default included in the search path, so we can load them directly using the load (:l) command:

> :l scoped

Use the :? command to get an overview of all commands. After loading the scoped demo, we can run it directly from the interpreter:

> :l scoped
compile: test/algeff/scoped.kk
check  : scoped
modules:
  scoped

> main()
[[3],[2,1],[1,2],[1,1,1]]
(state=12, [[3],[2,1],[1,2],[1,1,1]])
[(state=1, [3]),(state=5, [2,1]),(state=5, [1,2]),(state=9, [1,1,1])]
[[3]]
[42]

Some interesting demos are:

  • common.kk: Various examples from the paper “Algebraic Effects for Functional Programming” [3]. Shows how to implement common control-flow abstractions like exceptions, state, iterators, ambiguity, and asynchronous programming.

  • scoped.kk: Various examples from the paper “Effect handlers in Scope” [4].

  • nim.kk: Various examples from the paper “Liberating effects with rows and handlers” [1].

  • async*.kk: Various asynchronous effect examples.

  • parser.kk: Implements parser combinators as an effect.

🔗1.4. A primer on effect handlers

Another small demo is effs2, which demonstrates the ambiguity and state effects:

> :l effs2
compile: test/algeff/effs2.kk
check  : effs2
modules:
  effs1

> main()
[False,True,True,False]
[False,False,True,True,False]
[False,False]

The effs2.kk module starts by defining the amb effect:

effect amb {
  fun flip() : bool
}

This declares amb as a new effect with a single operation flip. We can use this as:

fun xor() : amb bool {
  val p = flip()
  val q = flip()
  (p||q) && not(p&&q)
}

where the type of xor reflects that it has an amb effect and returns a boolean.

Next, let's write a handler for this effect:

val amb = handler {
  return x -> [x]
  flip()   -> resume(False) + resume(True)
}

When a flip operation is issued, this handler will catch it dynamically. In the above handler, we resume twice: once with a False value as the result for flip, and once with a True value. The return clause wraps the final result in a list, and these lists are concatenated by the flip clause. The type of the amb handler is a function that removes the amb effect from its argument, and returns a list of results:

> :t amb
forall<a,e> (action : () -> <amb|e> a) -> e list<a>

We can now run the xor function using the amb handler to handle the flip operations:

fun test1() {
  amb(xor).show.println
}

Here we used the dot syntax introduced in Section 2.2. If we run test1 in the interpreter, we see all possible results:

> test1()
[False,True,True,False]


Adding state
Let's combine the ambiguity effect with state. The definition of the state effect is polymorphic in its value:

effect state<s> {
  fun get()    : s
  fun set(i:s) : ()
}

Next we define a function that uses both ambiguity and the state effect:

fun foo() : <amb,state<int>> bool {
  val p = flip() 
  val i = get()
  set(i+1)
  if (i>0 && p) then xor() else False
}

The handler for the state effect takes a local parameter that is propagated through the resume function.

val state = handler(i) {
  return x -> x
  get()    -> resume(i,i)
  set(j)   -> resume(j,())
}

The type of the state handler shows that it takes an initial state as an extra argument:

> :t state
forall<a,b,e> (i : a, action : () -> <state<a>|e> b) -> e b

We can now combine the ambiguity handler with the state handler in two ways (see Section 2.4 for a primer on anonymous function syntax):

fun test2()  {
  state(0){ amb(foo) }
}

fun test3()  {
  amb{ state(0,foo) }
}

In test2 the state handler is outside, and every ambiguity execution modifies the same global state. In test3 the state handler is inside, and every ambiguity execution has its own local state instead. Can you predict the outcomes of running the tests?

> test2()
[False,False,True,True,False]

> test3()
[False,False]

🔗2. An overview of Koka

This is a short introduction to the Koka programming language meant for programmers familiar with languages like C++, C#, or JavaScript.

Koka is a function-oriented language that separates pure values from side-effecting computations (the word ‘koka’ (or 効果) means “effect” or “effective” in Japanese). Koka is also flexible and fun: Koka has many features that help programmers to easily and correctly change their data types and code organization even in large-scale programs, while having a small strongly-typed language core with a familiar JavaScript-like syntax.

🔗2.1. Hello world

As usual, we start with the familiar Hello world program:

fun main() {
  println("Hello world!") // println output
}

Koka uses familiar curly-braces syntax, where // starts a line comment. Functions are declared using the fun keyword.

If you are reading this on Rise4Fun, you can click the load in editor button in the upper right corner of the example to load it into the editor and run the program.

Here is another short example program that encodes a string using the Caesar cipher, where each lower-case letter in a string is replaced by the letter three places up in the alphabet:

fun main() { println(caesar("koka is fun")) }

fun encode( s : string, shift : int )
{
  fun encode-char(c) {
    if (c < 'a' || c > 'z') return c
    val base = (c - 'a').int
    val rot  = (base + shift) % 26
    (rot.char + 'a')
  }
  s.map(encode-char)
}

fun caesar( s : string ) : string {
  s.encode( 3 )
}

In this example, we declare a local function encode-char which encodes a single character c. The final statement s.map(encode-char) applies the encode-char function to each character in the string s, returning a new string where each character is Caesar encoded. The result of the final statement in a function is also the return value of that function, and you can generally leave out an explicit return keyword. Similarly, Koka's grammar is constructed in such a way that no semi-colons are needed to separate statements.
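As a small sketch of this convention (the function name add-one is made up for illustration), the last expression of a body is its result, with no return keyword and no semi-colons:

fun add-one( i : int ) : int {
  val j = i + 1   // no semi-colon needed to end this statement
  j               // the last expression is the function result
}

fun main() { println( add-one(41) ) }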

🔗2.2. Dot selection

Koka is a function-oriented language where functions and data form the core of the language (in contrast to objects, for example). In particular, the expression s.encode(3) does not select an encode method from a string object; it is simply syntactic sugar for the function call encode(s,3) where s becomes the first argument. Similarly, c.int converts a character to an integer by calling int(c) (and both expressions are equivalent). The dot notation is intuitive and quite convenient for chaining multiple calls together, as in:

fun showit( s : string ) -> s.encode(3).count.println

for example (where the body desugars to println(count(encode(s,3)))). An advantage of the dot notation as syntactic sugar for function calls is that it is easy to extend the ‘primitive’ methods of any data type: just write a new function that takes that type as its first argument. In most object-oriented languages one would need to add such a method to the class definition itself, which is not always possible if the class comes from a library, for example.
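To illustrate, here is a hypothetical extension of int with a new ‘method’ (double is a made-up name, not part of std/core): since dot selection is plain function-call sugar, defining an ordinary function suffices:

// i.double is simply sugar for double(i)
fun double( i : int ) : int {
  i + i
}

fun main() { 21.double.println }   // equivalent to println(double(21))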

🔗2.3. Types

Koka is also strongly typed. It uses a powerful type inference engine to infer most types, and types generally do not get in the way. In particular, you can always leave out the types of any local variables. This is the case, for example, for the base and rot values in the previous example; hover with the mouse over the example to see the types that were inferred by Koka. Generally though, it is good practice to write type annotations for function parameters and the function result, since it both helps type inference and provides useful documentation with better feedback from the compiler.

For the encode function it is actually essential to give the type of the s parameter: since the map function is defined for both list and string types, the program is ambiguous without an annotation. Try to load the example in the editor and remove the annotation to see what error Koka produces.

🔗2.4. Anonymous functions

Koka also allows for anonymous function expressions. For example, instead of declaring the encode-char function, we could just have passed it directly to the map function as a function expression:

fun encode2( s : string, shift : int )
{
  s.map( fun(c) {
    if (c < 'a' || c > 'z') return c
    val base = (c - 'a').int
    val rot  = (base + shift) % 26
    (rot.char + 'a')
  });
}

It is a bit annoying that we had to put the final right parenthesis after the last brace. As a convenience, Koka allows anonymous functions to follow the function call instead. For example, here is how we can print the numbers 1 to 10:

fun main() { print10() }

fun print10() 
{
  for(1,10) fun(i) {
    println(i)
  }
}

which is desugared to for( 1, 10, fun(i){ println(i) } ). In fact, since we pass the i argument directly to println, we can also pass the function itself directly, and write for(1,10,println).

Anonymous functions without any arguments can be shortened further by leaving out the fun keyword and just using braces directly. Here is an example using the repeat function:

fun main() { printhi10() }

fun printhi10()
{
  repeat(10) {
    println("hi")
  }
}

where the body desugars to repeat( 10, fun(){ println("hi") } ). This is especially convenient for the while loop, since while is not a built-in operator in Koka but just a regular function:

fun main() { print11() }

fun print11() {
  var i := 10
  while { i >= 0 } { 
    println(i)
    i := i - 1 
  }
}

Note how the first argument to while is in braces instead of the usual parentheses: Koka always makes it explicit whether code is evaluated before a function is called (in between parentheses), or whether code is evaluated (potentially multiple times) by the called function instead (in between braces).
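This distinction can be sketched with the repeat function shown above (the argument expression and string are made up for illustration):

fun main() {
  // between parentheses: 1 + 2 is evaluated once, before repeat is called
  repeat( 1 + 2 ) {
    // between braces: this body is evaluated by repeat itself, three times
    println("hello")
  }
}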

🔗2.5. Effect types

A novel part about Koka is that it automatically infers all the side effects that occur in a function. The absence of any effect is denoted as total (or <>) and corresponds to pure mathematical functions. If a function can raise an exception, the effect is exn, and if a function may not terminate, the effect is div (for divergence). The combination of exn and div is pure and corresponds directly to Haskell's notion of purity. Non-deterministic functions get the ndet effect. The ‘worst’ effect is io and means that a program can raise exceptions, not terminate, be non-deterministic, read and write to the heap, and do any input/output operations. Here are some examples of effectful functions:

fun square1( x : int ) : total int
{
  return x*x
}

fun square2( x : int ) : io int
{
  println( "a not so secret side-effect" )
  return x*x
}

fun square3( x : int ) : div int
{
  square3( x )
  return x*x
}

fun square4( x : int ) : exn int
{
  error( "oops" )
  return x*x
}

When the effect is total we usually leave it out in the type annotation. For example, when we write:

fun square5( x : int ) : int { x*x }

Then the assumed effect is total. Sometimes we write an effectful function but are not interested in explicitly writing down its effect type. In that case, we can use a wildcard type, which stands for some inferred type. A wildcard type is denoted by an identifier prefixed with an underscore, or even just an underscore by itself:

fun square6( x : int ) : _e int
{
  println("I did not want to write down the io effect")
  return x*x
}

Hover over square6 to see the inferred effect for _e.

🔗2.6. Semantics of effects

The inferred effects are not just considered as some extra type information on functions. On the contrary, through the inference of effects, Koka has a very strong connection to its denotational semantics. In particular, the full type of a Koka function corresponds directly to the type signature of the mathematical function that describes its denotational semantics. For example, using 〚t〛 to translate a type t into its corresponding mathematical type signature, we have:


〚int -> total int〛            = ℤ32 → ℤ32
〚int -> exn int〛              = ℤ32 → (1 + ℤ32)
〚int -> pure int〛             = ℤ32 → (1 + ℤ32)⊥
〚int -> <st<h>,pure> int〛     = (Heap × ℤ32) → (Heap × (1 + ℤ32))⊥

In the above translation, we use 1 + t as a sum type where we have either a unit 1 (i.e. an exception) or a type t, and we use Heap × t for a product consisting of a pair of a heap and a type t. From the above correspondence, we can immediately see that a total function is truly total in the mathematical sense, while a stateful function (st<h>) that can raise exceptions or not terminate (pure) takes an implicit heap parameter, and either does not terminate (⊥) or returns an updated heap together with either a value or an exception (1).

We believe that this semantic correspondence is the true power of full effect types and it enables effective equational reasoning about the code by a programmer. For almost all other existing programming languages, even the most basic semantics immediately include complex effects like heap manipulation and divergence. In contrast, Koka allows a layered semantics where we can easily separate out nicely behaved parts, which is essential for many domains, like safe LINQ queries, parallel tasks, tier-splitting, sand-boxed mobile code, etc.

🔗2.7. Combining effects

Often, a function contains multiple effects, for example:

fun combine-effects() 
{
  val i = random-int() // non-deterministic
  error("hi")          // exception raising
  combine-effects()    // and non-terminating
}

The effects assigned to combine-effects are ndet, div, and exn. We can write such a combination as a row of effects: <div,exn,ndet>. When you hover over the combine-effects identifier, you will see that the inferred type is really <pure,ndet>, where pure is a type alias defined as

alias pure = <div,exn>

🔗2.8. Polymorphic effects

Many functions are polymorphic in their effect. For example, the map function applies a function f to each element of a (finite) list. As such, the effect depends on the effect of f, and the type of map becomes:

map : (xs : list<a>, f : (a) -> e b) -> e list<b>

We use single letters (possibly followed by digits) for polymorphic types. Here, the map function takes a list with elements of some type a, and a function f that takes an element of type a and returns a new element of type b. The final result is a list with elements of type b. Moreover, the effect e of the applied function is also the effect of the map function itself; indeed, map has no other effect by itself since it neither diverges nor raises exceptions.
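As a small sketch (the function names squares and print-all are made up), the effect of map simply follows the effect of the function being applied:

// squaring has no effect, so mapping it over a list is total
fun squares( xs : list<int> ) : total list<int> {
  xs.map( fun(i) { i * i } )
}

// println has the console effect, and map inherits it
fun print-all( xs : list<int> ) : console list<()> {
  xs.map( println )
}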

We can use the notation <l|e> to extend an effect e with another effect l. This is used, for example, in the while function, which has the type: while : ( pred : () -> <div|e> bool, action : () -> <div|e> () ) -> <div|e> (). The while function takes a predicate function and an action to perform, both with effect <div|e>. Indeed, since while may diverge depending on the predicate, its effect must include divergence.

The reader may be worried that the type of while forces the predicate and action to have exactly the same effect <div|e>, which even includes divergence. However, when effects are inferred at the call-site, both the effects of the predicate and action are extended automatically until they match. This ensures we take the union of the effects of the predicate and action. Take for example the following loop:

fun main() { looptest() }

fun looptest() 
{
  while { odd(random-int()) } 
  {
    error("<b>")
  }
}

In the above program, Koka infers that the predicate odd(random-int()) has effect <ndet|e1> while the action has effect <exn|e2> for some e1 and e2. When applying while, those effects are unified to the type <exn,ndet,div|e3> for some e3 (which can be seen by hovering over the looptest identifier).

🔗2.9. Isolated state

The Fibonacci numbers are a sequence where each subsequent Fibonacci number is the sum of the previous two, where fib(0) == 0 and fib(1) == 1. We can easily calculate Fibonacci numbers using a recursive function:

fun main() { println(fib(10)) }

fun fib(n : int) : div int
{
  if (n <= 0)   then 0
  elif (n == 1) then 1
  else fib(n - 1) + fib(n - 2)
}

Note that the type inference engine is currently not powerful enough to prove that this recursive function always terminates, which leads to the inclusion of the divergence effect div in the result type.

Here is another version of the Fibonacci function but this time implemented using local state. We use the repeat function to iterate n times:

fun main() { println(fib2(10)) }

fun fib2(n) 
{
  var x := 0
  var y := 1
  repeat(n) {
    val y0 = y
    y := x+y
    x := y0
  }
  return x
}

The var declaration declares a variable that can be assigned to using the (:=) operator. In contrast, a regular equality sign, as in y0 = y, introduces an immutable value y0. For clarity, one can write val y0 = y for such a declaration too, but we usually leave out the val keyword.

Local variables declared using var are actually syntactic sugar for allocating explicit references to mutable cells. A reference to a mutable integer is allocated using r = ref(0) (since the reference itself is actually a value!), and can be dereferenced using the bang operator, as !r. The desugared version of our previous Fibonacci function can be written using explicit references as

fun main() { println(fib3(10)) }

fun fib3(n) 
{
  val x = ref(0)
  val y = ref(1)
  repeat(n) {
    val y0 = !y
    y := !x + !y
    x := y0
  }
  return !x
}

As we can see, using var declarations is quite convenient, since such a declaration automatically adds a dereferencing operator to all occurrences except those on the left-hand side of an assignment.

When we look at the types inferred for the references, we see that x and y have type ref<h,int>, which stands for a reference to a mutable value of type int in some heap h. The effects on heaps are allocation as alloc<h>, reading from a heap as read<h>, and writing to a heap as write<h>. The combination of these effects is called stateful and denoted with the alias st<h>.

Clearly, the effect of the body of fib3 is st<h>; but when we hover over fib3, we see that the type inferred is actually the total effect: (n : int) -> int. Indeed, even though fib3 is stateful inside, its side-effects can never be observed. It turns out that we can safely discard the st<h> effect whenever the heap type h cannot be referenced outside this function, i.e. it is not part of an argument or return type. More formally, the Koka compiler proves this by showing that a function is fully polymorphic in the heap type h and applies the run function (corresponding to runST in Haskell) to discard the st<h> effect.

The Garsia-Wachs algorithm is a nice example where side-effects are used internally across function definitions and data structures, but where the final algorithm itself behaves like a pure function; see the lib/demo/garsiaWachs.kk example in the distribution.

🔗2.10. A larger example: cracking Caesar encoding

Enough about effects and imperative updates. Let's look at some more functional examples :-) For example, cracking Caesar encoded strings:

fun main() { test-uncaesar() }

fun encode( s : string, shift : int )
{
  function encode-char(c) {
    if (c < 'a' || c > 'z') return c
    val base = (c - 'a').int
    val rot  = (base + shift) % 26
    (rot.char + 'a')
  }

  s.map(encode-char)
}

// The letter frequency table for English
val english = [8.2,1.5,2.8,4.3,12.7,2.2,
               2.0,6.1,7.0,0.2,0.8,4.0,2.4,
               6.7,7.5,1.9,0.1, 6.0,6.3,9.1,
               2.8,1.0,2.4,0.2,2.0,0.1]

// Small helper functions
fun percent( n : int, m : int ) {
  100.0 * (n.double / m.double)
}

fun rotate( xs, n ) {
  xs.drop(n) + xs.take(n)
}

// Calculate a frequency table for a string
fun freqs( s : string ) : list<double>
{
  val lowers = list('a','z')
  val occurs = lowers.map( fun(c){ s.count(c.string) })
  val total  = occurs.sum
  occurs.map( fun(i){ percent(i,total) } )
}

// Calculate how well two frequency tables match according 
// to the _chi-square_ statistic.
fun chisqr( xs : list<double>, ys : list<double> ) : double
{
  zipwith(xs,ys, fun(x,y){ ((x - y)^2.0)/y } ).sum()
}

// Crack a Caesar encoded string
fun uncaesar( s : string ) : string
{
  val table  = freqs(s)                   // build a frequency table for `s`
  val chitab = list(0,25).map( fun(n) {   // build a list of chisqr numbers for each shift between 0 and 25
                  chisqr( table.rotate(n), english ) 
               })
  val min    = chitab.minimum()           // find the minimal element
  val shift  = chitab.index-of( fun(f){ f == min } ).negate  // and use its position as our shift
  s.encode( shift )
}
  
fun test-uncaesar() {
  println( uncaesar( "nrnd lv d ixq odqjxdjh" ) )
}

The val keyword declares a static value. In the example, the value english is a list of floating point numbers (of type double) denoting the average frequency for each letter. The function freqs builds a frequency table for a specific string, while the function chisqr calculates how well two frequency tables match. In the function uncaesar these functions are used to find a shift value that results in a string whose frequency table matches the english one the closest – and we use that to decode the string. Let's try it out in the editor!

🔗2.11. Optional and named parameters

Being a function-oriented language, Koka has powerful support for function calls, with both optional and named parameters. For example, the function replace-all takes a string s, a pattern pattern, and a replacement string repl:

fun main() { println(world()) }

fun world() 
{
  replace-all("hi there", "there", "world")  // returns "hi world"
}

Using named parameters, we can also write the function call as:

fun main() { println(world2()) }

fun world2() 
{
  return "hi there".replace-all( repl="world", pattern="there" )
}

Optional parameters let you specify default values for parameters that need not be provided at a call-site. As an example, let's define a function sublist that takes a list, a start position, and the length len of the desired sublist. We can make the len parameter optional: its default is the length of the input list, so that by default all elements following the start position are returned:

fun main() { println( ['a','b','c'].sublist(1).string ) }

fun sublist( xs : list<a>, start : int,
                  len : int = xs.length ) : list<a>
{
  if (start <= 0) return xs.take(len)
  match(xs) {
    Nil -> Nil
    Cons(_,xx) -> xx.sublist(start - 1, len)
  }
}

Hover over the sublist identifier to see its full type, where the len parameter has gotten an optional int type signified by the question mark: :?int.

🔗2.12. Structs

An important aspect of a function-oriented language is to be able to define rich data types over which the functions work. A common data type is that of a struct or record. Here is an example of a struct that contains information about a person:

struct person( age : int,
               name : string,
               realname : string = name )

val gaga = Person( 25, "Lady Gaga" )

Every struct (and other data types) comes with constructor functions to create instances, as in Person( 25, "Lady Gaga" ). Moreover, these constructors can use named arguments, so we can also call the constructor as Person( name = "Lady Gaga", age = 25, realname = "Stefani Joanne Angelina Germanotta" ), which is quite close to regular record syntax but without any special rules; it is just functions all the way down!

Also, Koka automatically generates accessor functions for each field in a struct (or other data type), and we can access the age of a person as gaga.age (which is of course just syntactic sugar for age(gaga)).

🔗2.13. Copying

By default, all structs (and other data types) are immutable. Instead of directly mutating a field in a struct, we usually return a new struct where the fields are updated. For example, here is a birthday function that increments the age field:

fun main() { println( gaga.birthday.age ) }

struct person( age : int, name : string, realname : string = name )

val gaga = Person( 25, "Lady Gaga" )

fun birthday( p : person ) : person  
{
  return p( age = p.age + 1 )
}

Here, birthday returns a fresh person which is equal to p but with the age incremented. The syntax p(...) is sugar for calling the copy constructor of a person. This constructor is also automatically generated for each data type, and is basically defined as:

fun main() { println( gaga.copy().age ) }

struct person( age : int, name : string, realname : string = name )

val gaga = Person( 25, "Lady Gaga" )

fun copy( p, age = p.age, name = p.name,
               rname = p.realname ) 
{
  return Person(age, name, rname) 
}

When arguments follow a data value, as in p( age = p.age + 1 ), this is desugared to a call to the copy function, as in p.copy( age = p.age + 1 ). Again, there are no special rules for record updates and everything is just function calls with optional and named parameters.
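This functional-update-by-copy pattern exists in many languages. As a rough analogy only (this is Python, not Koka, and the names below are chosen for illustration), Python's frozen dataclasses with dataclasses.replace offer the same keyword-driven copy of an immutable record:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Person:
    age: int
    name: str
    realname: str = ""

gaga = Person(25, "Lady Gaga", "Stefani Joanne Angelina Germanotta")

# analogous to Koka's  p( age = p.age + 1 )
older = replace(gaga, age=gaga.age + 1)

print(older.age)  # 26
print(gaga.age)   # 25 -- the original value is unchanged
```

As in Koka, the original value is never mutated; an updated copy is returned instead.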

🔗2.14. More data types

Koka also supports algebraic data types where there are multiple alternatives. For example, here is an enumeration:

type colors {
  Red
  Green
  Blue
}

Special cases of these enumerated types are the void type, which has no alternatives (and therefore there exists no value with this type); the unit type (), which has just one constructor, also written as () (and therefore there exists only one value with the type (), namely ()); and finally the boolean type bool with two constructors True and False.

type void
type () {
  ()
}
type bool {
  False
  True
}

Constructors can have parameters. For example, here is how to create a number type which is either an integer or the infinity value:

type number { 
  Infinity
  Integer( i : int )
}

We can create such a number by writing Integer(1) or Infinity. Moreover, data types can be polymorphic and recursive. Here is the definition of the list type, which is either empty (Nil) or a head element followed by a tail list (Cons):

type list<a> {
  Nil
  Cons( head : a, tail : list<a> )
}

Koka automatically generates accessor functions for each named parameter. For lists, for example, we can access the head of a list as Cons(1,Nil).head.

We can now also see that struct types are just syntactic sugar for a regular type with a single constructor of the same name as the type. For example, our earlier person struct, defined as

struct person( age : int, name : string,
               realname : string = name )

desugars to:

type person { 
  Person( age : int, name : string, realname : string = name ) 
}

🔗2.15. Matching

Todo

🔗2.16. Inductive, co-inductive, and recursive types

For the purposes of equational reasoning and termination checking, a type declaration is limited to finite inductive types. There are two more declarations, namely cotype and rectype, that allow for co-inductive types and arbitrary recursive types respectively.

🔗3. Koka language specification

This is the draft language specification of the Koka language, version 0.7.
Currently only the lexical and context-free grammar are specified. The standard libraries are documented separately.

🔗3.1. Lexical syntax

We define the grammar and lexical syntax of the language using standard BNF notation where non-terminals are generated by alternative patterns:

nonterm ::= pattern1 | pattern2

In the patterns, we use the following notations:

terminal A terminal symbol
x0A A character with hexadecimal code 0A
a..f The characters from a to f
 
( pattern ) Grouping
[ pattern ] Optional occurrence of pattern
{ pattern } Zero or more occurrences of pattern
pattern1 | pattern2 Choice: either pattern1 or pattern2
 
pattern<!diff> Difference: elements generated by pattern except those in diff
pattern<postfix> Postfix: any elements in pattern that are directly followed by postfix
<prefix>pattern Prefix: any elements in pattern that directly follow prefix
nonterm[lex] Generate nonterm by drawing lexemes from lex

Care must be taken to distinguish meta-syntax such as | and ) from concrete terminal symbols as | and ). In the specification the order of the productions is not important and at each point the longest matching lexeme is preferred. For example, even though function is a reserved word, the word functions is considered a single identifier. A prefix or postfix pattern is included when considering a longest match.

🔗3.1.1. Source code

Source code consists of a sequence of 8-bit characters. Valid characters in actual program code consist strictly of ASCII characters, which range from 0 to 127 and can be encoded in 7 bits. Only comments, string literals, and character literals are allowed to contain extended 8-bit characters.

🔗3.1.2. Encoding

A program source is assumed to be UTF-8 encoded which allows comments, string literals, and character literals to contain (encoded) unicode characters. Moreover, the grammar is designed such that a lexical analyzer and parser can directly work on source files without doing UTF-8 decoding or unicode category identification. To further facilitate the processing of UTF-8 encoded files the lexical analyzer ignores an initial byte-order mark that some UTF-8 encoders insert. In particular, any program source is allowed to start with three byte-order mark bytes 0xEF, 0xBB, and 0xBF, which are ignored.
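The byte-order-mark rule can be sketched as a small pre-processing step over the raw source bytes. This is illustrative Python only, not the actual Koka lexer:

```python
# The UTF-8 byte-order mark that some encoders insert: 0xEF 0xBB 0xBF.
UTF8_BOM = b"\xEF\xBB\xBF"

def strip_bom(source: bytes) -> bytes:
    """Drop a leading UTF-8 byte-order mark, as the Koka lexer does."""
    if source.startswith(UTF8_BOM):
        return source[len(UTF8_BOM):]
    return source

assert strip_bom(b"\xEF\xBB\xBFfun main() {}") == b"fun main() {}"
assert strip_bom(b"fun main() {}") == b"fun main() {}"   # no BOM: unchanged
```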

🔗3.2. Lexical grammar

In the specification of the lexical grammar all white space is explicit and there is no implicit white space between juxtaposed symbols. The lexical token stream is generated by the non-terminal lex which consists of lexemes and whitespace.

Before doing lexical analysis, there is a linefeed character inserted at the start and end of the input, which makes it easier to specify line comments and directives.

🔗3.2.1. Lexical tokens

lex ::= lexeme | whitespace
lexeme   ::= conid | qconid
| varid | qvarid
| op | opid | qopid | wildcard
| natural | float | string | char
| reserved | opreserved
| special | funanon

The main program consists of whitespace or lexemes. The context-free grammar will draw its lexemes from the lex production.

🔗3.2.2. Identifiers

anyid ::= varid | qvarid | opid | qopid | conid | qconid
 
qconid ::= modulepath conid
qvarid ::= modulepath lowerid
modulepath ::= lowerid / { lowerid / }
 
conid ::= upperid
varid ::= lowerid<!reserved>
 
lowerid ::= lower idtail
upperid ::= upper idtail
wildcard ::= _ idtail
typevarid ::= letter { digit }
 
idtail ::= { idchar } [ idfinal ]
idchar ::= letter | digit | _ | -
idfinal ::= ? | { ' }
 
funanon ::= (fun | function)<|(> (anonymous functions must be followed by a ( or <))
reserved ::= infix | infixr | infixl | prefix
| type | cotype | rectype | alias
| forall | exists | some
| fun | function | val | var | con
| if | then | else | elif | match | return
| module | import | as
| public | private | abstract
| interface | instance | with
| external | inline | include
| effect | handle | handler | linear
| yield | qualified | hiding (future reserved words)

Identifiers always start with a letter, may contain underscores and dashes, and can end with a question mark or primes. Like in Haskell, constructors always begin with an uppercase letter while regular identifiers are lowercase. The rationale is to visibly distinguish constants from variables in pattern matches. Here are some example of valid identifiers:

x
concat1
visit-left
nil?
x'
Cons
True  
x
concat1
visit-left
nil?
x'
Consstd/core/Cons: forall<a> (head : a, tail : list<a>) -> list<a>
Truestd/core/True: bool  

To avoid confusion with the subtraction operator, the occurrences of dashes are restricted in identifiers. After lexical analysis, only identifiers where each dash is surrounded on both sides with a letter are accepted:

fold-right
n-1        // illegal, a digit cannot follow a dash
n - 1      // n minus 1
n-x-1      // illegal, a digit cannot follow a dash
n-x - 1    // identifier "n-x" minus 1
n - x - 1  // n minus x minus 1
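The dash rule above can be applied as a check after ordinary lexing: an identifier is accepted only when every dash has a letter immediately on both sides. A Python sketch of this check (illustrative only, not part of the Koka toolchain):

```python
def valid_dashes(ident: str) -> bool:
    """Accept an identifier only if every '-' is surrounded by letters."""
    for i, ch in enumerate(ident):
        if ch == '-':
            # A dash at either end, or next to a non-letter, is rejected.
            if i == 0 or i == len(ident) - 1:
                return False
            if not (ident[i - 1].isalpha() and ident[i + 1].isalpha()):
                return False
    return True

assert valid_dashes("fold-right")
assert valid_dashes("n-x")
assert not valid_dashes("n-1")    # a digit cannot follow a dash
assert not valid_dashes("n-x-1")  # ditto for the second dash
```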

Qualified identifiers are prefixed with a module path. Module paths can be partial as long as they are unambiguous.

core/map
std/core/(&)

🔗3.2.3. Operators and symbols

qopid ::= modulepath opid
opid ::= ( symbols )
 
op ::= symbols<! opreserved|optype> | ||
 
symbols ::= symbol { symbol }| /
symbol ::= $ | % | & | * | +
| ~ | ! | \ | ^ | #
| = | . | : | - | ?
| anglebar
anglebar ::= < | > | |
 
opreserved ::= = | . | : | ->
optype ::= anglebar anglebar { anglebar }
 
special ::= { | } | ( | ) | [ | ] | | | ; | , | lapp | lidx
lapp ::= <apply>(
lidx ::= <apply>[
apply ::= ) | ] | anyid
 

🔗3.2.4. Literals

string ::= @" { graphic<"> | utf8 | space | tab | newline | "" } " (raw string)
| " { graphic<"|\> | utf8 | space | escape } "
char ::= ' ( graphic<'|\> | utf8 | space | escape ) '
 
escape ::= \ ( charesc | hexesc )
charesc ::= n | r | t | \ | " | '
hexesc ::= x hexdigit2 | u hexdigit4 | U hexdigit4 hexdigit2
hexdigit4 ::= hexdigit hexdigit hexdigit hexdigit
hexdigit2 ::= hexdigit hexdigit
 
float ::= decimal . decimal [ exponent ]
exponent ::= (e | E) [ - | + ] decimal
natural ::= decimal | 0   (x | X) hexadecimal
decimal ::= digit { digit }
hexadecimal ::= hexdigit { hexdigit }

🔗3.2.5. White space

whitespace ::= white { white } | newline
white ::= space
| linecomment | blockcomment
| linedirective
 
linecomment ::= // { linechar }
linedirective ::= newline # { linechar }
linechar ::= graphic | utf8 | space | tab
 
blockcomment ::= /* blockpart { blockcomment blockpart } */ (allows nested comments)
blockpart ::= blockchars<blockchars (/*|*/blockchars>
blockchars ::= { blockchar }
blockchar ::= graphic | utf8 | space | tab | newline

🔗3.2.6. Character classes

letter ::= upper | lower
upper ::= A..Z
lower ::= a..z
digit ::= 0..9
hexdigit ::= a..f | A..F | digit
 
newline ::= [ return ] linefeed (windows or unix style end of line)
 
space ::= x20 (a space)
tab ::= x09 (a tab (\t))
linefeed ::= x0A (a line feed (\n))
return ::= x0D (a carriage return (\r))
graphic ::= x21..x7E (a visible character)
 
utf8 ::= xC0 x80 (encoded 0 character)
| (xC2..xDF) cont
| xE0 (xA0..xBF) cont
| (xE1..xEC) cont cont
| xED (x80..x9F) cont
| (xEE..xEF) cont cont
| xF0 (x90..xBF) cont cont
| (xF1..xF3) cont cont cont
| xF4 (x80..x8F) cont cont
cont ::= x80..xBF
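The utf8 production above describes well-formed multi-byte UTF-8 sequences (excluding surrogates and overlong forms), plus the modified-UTF-8 encoding xC0 x80 for the NUL character. A recognizer for a single utf8 lexeme can be transcribed directly from the production; this Python sketch is illustrative only, not the actual Koka lexer:

```python
def match_utf8(bs: bytes, i: int) -> int:
    """Return the length of one 'utf8' lexeme at bs[i], or 0 if none matches."""
    def cont(j: int, n: int) -> bool:
        # bytes at offsets j .. j+n-1 must all be continuation bytes (cont)
        return all(i + k < len(bs) and 0x80 <= bs[i + k] <= 0xBF
                   for k in range(j, j + n))
    def at(j: int, lo: int, hi: int) -> bool:
        return i + j < len(bs) and lo <= bs[i + j] <= hi

    b = bs[i]
    if b == 0xC0 and at(1, 0x80, 0x80):
        return 2                                   # encoded 0 character
    if 0xC2 <= b <= 0xDF and cont(1, 1):
        return 2
    if b == 0xE0 and at(1, 0xA0, 0xBF) and cont(2, 1):
        return 3
    if 0xE1 <= b <= 0xEC and cont(1, 2):
        return 3
    if b == 0xED and at(1, 0x80, 0x9F) and cont(2, 1):
        return 3                                   # excludes surrogates
    if 0xEE <= b <= 0xEF and cont(1, 2):
        return 3
    if b == 0xF0 and at(1, 0x90, 0xBF) and cont(2, 2):
        return 4
    if 0xF1 <= b <= 0xF3 and cont(1, 3):
        return 4
    if b == 0xF4 and at(1, 0x80, 0x8F) and cont(2, 2):
        return 4
    return 0

assert match_utf8("é".encode("utf-8"), 0) == 2   # U+00E9: C3 A9
assert match_utf8("€".encode("utf-8"), 0) == 3   # U+20AC: E2 82 AC
assert match_utf8(b"\xC0\x80", 0) == 2           # modified-UTF-8 NUL
assert match_utf8(b"\x41", 0) == 0               # plain ASCII is not 'utf8'
```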

🔗3.2.7. Semicolons?

Many languages have rules to avoid writing semicolons to separate statements. Even though most of the time these rules are quite intuitive to use, their actual definition can be quite subtle. For example, both Scala and Go require remembering specific tokens to know precisely when semicolon insertion takes place. In the case of JavaScript and Haskell (I am sad to admit) the precise behavior is bizarrely complex where semicolon insertion depends on the interaction between the lexer and parser.

In Koka, the grammar is carefully constructed to not need any statement separator at all and semicolons are never required by the grammar. They are still allowed in the grammar but strictly to help new programmers that are used to putting semicolons at the end of statements.

The construction of a grammar that does not need statement separators is also good from a human perspective. The reason semicolons are required is to resolve ambiguities in the syntax. When such ambiguities do not occur in the first place, that also removes a cognitive burden from the programmer. In particular, Koka statements often start with a keyword, like val or match, signifying intention to both programmer and parser.

In other cases, we restrict the expression grammar. For example, one reason why C requires semicolons is due to prefix- and postfix operators. If we write p ++ q the C parser needs a semicolon in order to know if we meant p++; q or p; ++q. Such ambiguity is resolved in Koka by not having postfix operators and restricting prefix operators to ! and ~.

One other reason that Koka can do without a statement separator is the effect inference system: without such effect inference subtle bugs may occur if we leave out semicolons. For example, consider the following function:

fun square( x : int ) {
  x * x
}

which returns the square of x. Suppose now that we forgot to put in the multiplication operation, giving:

fun square-wrong( x : int ) {
  x x
}

The Koka grammar sees this as 2 separate statements now, i.e. as x; x returning x instead. In a language without effect inference it is hard to detect such errors, but the Koka type system rejects this program:

> fun square-wrong(x:int) { x x }
                            ^
((1),27): error: expression has no effect and is unused
  hint: did you forget an operator? or is there a space between an application?

🔗3.2.8. Implementation

There is a full Flex (Lex) implementation of the lexical analysis. Ultimately, the Flex implementation serves as the specification, and this document and the Flex implementation should always be in agreement.

🔗3.3. Context-free syntax

The grammar specification starts with the non terminal module which draws its lexical tokens from lex where all whitespace tokens are implicitly ignored.

🔗3.3.1. Modules

module[ lex ] ::= [ moduledecl ] modulebody
 
moduledecl ::= semis [ visibility ] module moduleid
moduleid ::= qvarid | varid
 
modulebody ::= { semis declarations } semis
| semis declarations
 
visibility ::= public | private
semis ::= { ; }

🔗3.3.2. Top level declarations

declarations ::= { import semis } { fixitydecl semis } { topdecl semis }
 
import ::= [ visibility ] import [ moduleid = ] moduleid
 
fixitydecl ::= [ visibility ] fixity natural identifier { , identifier }
fixity ::= infixl
| infixr
| infix
 
topdecl ::= [ visibility ] puredecl
| [ visibility ] aliasdecl
| [ visibility ] typedecl
| [ visibility ] externdecl
| abstract typedecl

🔗3.3.3. Type declarations

aliasdecl ::= alias typeid [ typeparams ] [ kannot ] = type
typedecl ::= typesort [ typemod ] typeid [ typeparams ] [ kannot ] [ typebody ]
| struct typeid [ typeparams ] [ kannot ] [ conparams ]
| effect [ linear ] typeid [ typeparams ] [ kannot ] [ opdecls ]
typesort ::= type | cotype | rectype
typemod ::= open | extend
 
typeid ::= varid
| []
| ( { , } )
| < >
| < | >
 
typeparams ::= < [ tbinders ] >
tbinders ::= tbinder { , tbinder }
tbinder ::= varid [ kannot ]
typebody ::= { semis { constructor semis } }
 
constructor ::= [ con ] conid [ typeparams ] [ conparams ]
conparams ::= lparen [ conparam { , conparam } ] )
conparam ::= [ paramid ] : paramtype [ = expr ]
lparen ::= lapp | (
 
opdecls ::= { semis { opdecl semis } }
opdecl ::= [ visibility ] identifier [ typeparams ] opparams [ : tatom ]
opparams ::= lparen [ opparam { , opparam } ] )
opparam ::= [ paramid ] : paramtype
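The four forms of type declaration can be illustrated with the following sketch (all names are hypothetical):

```koka
// aliasdecl
alias age = int

// typedecl with constructors
type color {
  Red
  Green
  Blue
}

// struct declaration; conparam allows a default expression
struct person( name : string, age : int = 0 )

// effect declaration with operation declarations (opdecls)
effect state<s> {
  get() : s
  put( x : s ) : ()
}
```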

🔗3.3.4. Value and function declarations

puredecl ::= val valdecl
| (fun|function) fundecl
 
valdecl ::= binder = expr
binder ::= identifier [ : type ]
 
fundecl ::= somes funid fundef bodyexpr
fundef ::= [ typeparams ] parameters [ : tresult ] [ qualifier ]
funid ::= identifier
| [ { , } ] (indexing operator)
 
parameters ::= lparen [ parameter { , parameter } ] )
parameter ::= paramid [ : paramtype ] [ = expr ]
paramid ::= identifier | wildcard
paramtype ::= type
| ? type (optional parameter)
 
qidentifier ::= qvarid | qidop | identifier
identifier ::= varid | idop
 
qoperator ::= op
 
qconstructor ::= conid | qconid
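For example, a value declaration (valdecl) and a function declaration (fundecl) following these productions might look like (names are illustrative only):

```koka
// binder with a type annotation, bound to an expression
val pi2 : double = 6.283185307179586

// parameters may carry type annotations and default expressions
fun add( x : int, y : int = 1 ) : int {
  x + y
}
```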

🔗3.3.5. Statements

block ::= { semis { statement semis } }
 
statement ::= expr<! funexpr>
| decl
 
decl ::= (fun|function) fundecl
| val apattern = valexpr (local values can use a pattern binding)
| var binder := valexpr
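A block mixing the three statement forms could be sketched as:

```koka
fun example() {
  val (x,y) = (1,2)   // local value using a pattern binding
  var i := 0          // mutable local variable declaration
  i := i + x + y      // expression statement
  i
}
```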

🔗3.3.6. Expressions

bodyexpr ::= -> blockexpr
| block
 
blockexpr ::= expr (implicitly wrapped in a block)
 
expr ::= returnexpr
| matchexpr
| handlerexpr
| funexpr
| ifexpr
| opexpr
 
ifexpr ::= if atom then { elif } [ else expr<!ifexpr> ]
then ::= [ then ] expr<!ifexpr>
elif ::= elif atom then
 
matchexpr ::= match atom { semis { matchrule semis } }
returnexpr ::= return opexpr
funexpr ::= funanon fundef block
| block (zero-argument anonymous function)
handlerexpr ::= handler handlereff handlerpars { handlerrules }
| handle handlereff haction handlerpars { handlerrules }
haction ::= lparen expr )

🔗3.3.7. Operator expressions

For simplicity, we parse all operators as if they were left-associative with the same precedence. We assume that a separate pass in the compiler uses the fixity declarations that are in scope to properly associate all operators in an expression.

opexpr ::= prefixexpr { qoperator prefixexpr }
prefixexpr ::= { ! | ~ } appexpr
 
appexpr ::= appexpr lapp [ arguments ] ) (regular application)
| appexpr lidx [ arguments ] ] (index operation)
| appexpr { funexpr } (apply function expressions)
| appexpr . atom
| atom
 
arguments ::= argument { , argument }
argument ::= [ identifier = ] expr
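For instance, with a fixity declaration such as the following (shown only as an illustration; the standard library already declares fixities for the usual operators), the re-association pass groups 1 - 2 - 3 as (1 - 2) - 3:

```koka
// a left-associative fixity declaration at precedence 6
infixl 6 (-)

fun example() {
  1 - 2 - 3   // parsed flat, then associated as (1 - 2) - 3
}
```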

🔗3.3.8. Atomic expressions

atom ::= qidentifier
| qconstructor
| literal
| ( ) (unit)
| ( annexpr ) (parenthesized expression)
| ( annexprs ) (tuple expression)
| [ [ annexpr { , annexpr } [ , ] ] ] (list expression)
 
literal ::= natural | float | char | string
 
annexprs ::= annexpr { , annexpr }
annexpr ::= expr [ : typescheme ]
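The atomic expression forms can be sketched as:

```koka
fun atoms() {
  val u  = ()          // unit
  val p  = (1,True)    // tuple expression
  val xs = [1,2,3,]    // list expression; a trailing comma is allowed
  val n  = (42 : int)  // parenthesized, annotated expression (annexpr)
  xs
}
```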

🔗3.3.9. Matching

matchrule ::= patterns | expr -> blockexpr
| patterns bodyexpr
 
apattern ::= pattern [ : typescheme ]
pattern ::= identifier
| wildcard
| qconstructor [lparen [ patargs ] )]
| ( [ apatterns ] ) (unit, parenthesized pattern, tuple pattern)
| [ [ apatterns ] ] (list pattern)
| apattern as identifier (named pattern)
| literal
 
patterns ::= pattern { , pattern }
apatterns ::= apattern { , apattern }
patargs ::= patarg { , patarg }
patarg ::= [ identifier = ] apattern (possibly named parameter)
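A match expression exercising several pattern forms might look like (the describe function is hypothetical):

```koka
fun describe( xs : list<int> ) : string {
  match(xs) {
    Nil       -> "empty"                    // constructor pattern
    Cons(x,_) -> "starts with " + x.show    // constructor with wildcard
  }
}
```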

🔗3.3.10. Handlers

handlerrules ::= semis handlerrule { semis handlerrule } semis
handlerrule ::= qidentifier opargs bodyexpr
| return (lparen oparg ) | paramid) bodyexpr
 
opargs ::= lparen [ oparg { , oparg } ] )
oparg ::= paramid [ : type ]
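A handler expression with an operation rule and a return rule could be sketched as follows, assuming a raise effect with a single raise operation has been declared elsewhere:

```koka
val h = handler {
  return x   -> Just(x)     // return rule with a paramid
  raise(msg) -> Nothing     // operation rule: qidentifier opargs bodyexpr
}
```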

🔗3.3.11. Type schemes

typescheme ::= somes foralls tarrow [ qualifier ]
type ::= foralls tarrow [ qualifier ]
 
foralls ::= [ forall typeparams ]
some ::= [ some typeparams ]
 
qualifier ::= with ( predicates )
 
predicates ::= predicate { , predicate }
predicate ::= typeapp (interface)

🔗3.3.12. Types

tarrow ::= tatom [ -> tresult ]
tresult ::= tatom [ tbasic ]
 
tatom ::= tbasic
| < anntype { , anntype } [ | tatom ] >
| < >
 
tbasic ::= typeapp
| ( ) (unit type)
| ( tparam ) (parenthesized type or type parameter)
| ( tparam { , tparam } ) (tuple type or parameters)
| [ anntype ] (list type)
 
typeapp ::= typecon [ < anntype { , anntype } > ]
 
typecon ::= varid | qvarid
| wildcard
| ( , { , } ) (tuple constructor)
| [ ] (list constructor)
| ( -> ) (function constructor)
 
tparam ::= [ varid : ] anntype
anntype ::= type [ kannot ]

🔗3.3.13. Kinds

kannot ::= :: kind
 
kind ::= ( kind { , kind } ) -> kind
| katom -> kind
| katom
 
katom ::= V (value type)
| X (effect type)
| E (effect row)
| H (heap type)
| P (predicate type)
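Kind annotations appear on type binders; for example, the following alias (made up for illustration) annotates a as a value type and e as an effect row:

```koka
// kannot is "::" kind; here a :: V and e :: E
alias action<a::V, e::E> = a -> e a
```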

🔗3.3.14. Implementation

As a companion to the Flex lexical implementation, there is a full Bison (Yacc) LALR(1) implementation available. Again, the Bison parser functions as the specification of the grammar, and this document should always be in agreement with that implementation.

References

[1] Daniel Hillerström and Sam Lindley. “Liberating Effects with Rows and Handlers.” Mar. 2016. Draft.
[2] Daan Leijen. “Koka: Programming with Row Polymorphic Effect Types.” In Mathematically Structured Functional Programming 2014. EPTCS. Mar. 2014. arXiv:1406.2061
[3] Daan Leijen. Algebraic Effects for Functional Programming. MSR-TR-2016-29. Microsoft Research. Aug. 2016. https://www.microsoft.com/en-us/research/publication/algebraic-effects-for-functional-programming
[4] Nicolas Wu, Tom Schrijvers, and Ralf Hinze. “Effect Handlers in Scope.” In Proceedings of the 2014 ACM SIGPLAN Symposium on Haskell, 1–12. Haskell ’14. ACM, New York, NY, USA. 2014. doi:10.1145/2633357.2633358

Appendix

🔗A. Full grammar specification

🔗A.1. Lexical syntax

🔗A.2. Context-free syntax