I am a Senior Researcher at Microsoft Research, Cambridge. I help Microsoft make better programming languages, and, through that, make people more productive and, hopefully, happier.
My main current responsibility is the design and implementation of F# (blog), though I've also worked on C# (being co-responsible for C# and .NET generics) and, indirectly, Visual Basic and other .NET languages.
As a researcher, my area is programming language design and implementation, with emphasis on making functional languages that are simpler to use, interoperate well with other languages and which incorporate aspects of object-oriented, asynchronous and parallel programming. I am particularly interested in programming language perspectives on type inference, concurrency, reactivity, pattern matching and language-oriented programming. I also work extensively with teams in the Microsoft Developer Division on other programming-related technologies.
I am the primary author of Expert F#, published in 2007, and we are now working on a second edition of this book. In the past I have worked in formal specification, interactive proof, automated verification and proof description languages. I have a PhD from the University of Cambridge and am a member of the WG2.8 working group on functional programming.
F# Tutorial and Talk Today at TechDays, Paris (Late Announcement!)
There is an F# tutorial and an F# talk today at TechDays, Paris, presented jointly with Adam Granicz and Robert Pickering.
The tutorial is at 13:30-14:30, Feb 9, and will be a light introduction to F#.
The talk is at 17:30-18:30, Feb 9, and will include a section from Adam Granicz, the founder of Intellifactory, on WebSharper, the F# framework for rich web development.
Exploring the Simplicity and Power of F# - Parallel and Rich Internet Applications
Abstract: F# is a simple, easy-to-use functional language that's part of Visual Studio 2010. You can use it for all sorts of things, from general purpose .NET scripting to implementing parallel algorithms and modelling financial contracts. This talk will take a high-level tour through some of the uses of F# that will surprise and delight you: from simple CPU and I/O parallelization to LINQ queries and running code on a GPU. We'll look at the magic of F# “expression trees” (quotations), which allow you to run F# code in novel ways, e.g. on a GPU, or as Javascript in a browser. As part of the presentation we'll take a look at Intellifactory's WebSharper platform, an application of these techniques which gives a seamless and simple way to program Rich Internet Applications (RIAs) with F#.
If you're at TechDays, then we look forward to seeing you there!
Tue, 09 Feb 2010 03:23:00 GMT
F# Seminar at University of Washington, Seattle, Thursday, Jan 28
I'll be giving a talk at UW in Seattle on Thursday, Jan 28, this week.
Hope to see you there!
Don Syme (Microsoft Research, Cambridge)
Host: Dan Grossman
Parallel and Asynchronous Programming with F#
CSE 520 Colloquium
Thursday, January 28, 2010, 3:30pm
Abstract
F# is a succinct and expressive typed functional programming language in the context of a modern, applied software development environment (.NET), and Microsoft will be supporting F# as a first class language in Visual Studio 2010. F# makes three primary contributions to parallel, asynchronous and reactive programming in the context of a VM-based platform such as .NET:
(a) functional programming greatly reduces the amount of explicit mutation used by the programmer for many programming tasks
(b) F# includes a powerful "async" construct for compositional reactive and parallel computations, including both parallel I/O and CPU computations, and
(c) "async" enables the definition and execution of lightweight agents without an adjusted threading model on the virtual machine.
In this talk, we will look at F# in general, including some general coding, and take a deeper look at each of these contributions and why they matter.
Tue, 26 Jan 2010 04:36:00 GMT
F# Seminar Tomorrow, Tuesday, 26/1, at Berkeley
I'll be giving a seminar tomorrow, Tuesday, at Berkeley, visiting Benjamin Hindman and Rastislav Bodik. The talk will be from 1:00 pm to 2:00 pm in room 320 of Soda Hall (moved from room 511). All welcome!
Title: Parallel and Asynchronous Programming with F#
Abstract: F# is a succinct and expressive typed functional programming language in the context of a modern, applied software development environment (.NET), and Microsoft will be supporting F# as a first class language in Visual Studio 2010. F# makes three primary contributions to parallel, asynchronous and reactive programming in the context of a VM-based platform such as .NET:
(a) functional programming greatly reduces the amount of explicit mutation used by the programmer for many programming tasks
(b) F# includes an “async” construct for compositional reactive and parallel computations, including both parallel I/O and CPU computations, and
(c) “async” enables the definition and execution of lightweight agents without an adjusted threading model on the virtual machine.
In this talk, we’ll look at F# in general, including some general coding, and take a deeper look at each of these contributions and why they matter.
Mon, 25 Jan 2010 09:48:00 GMT
Async and Parallel Design Patterns in F#: Reporting Progress with Events (plus Twitter Sample)
In this post we will look at a common async design pattern I call Reporting Progress With Events. Later in this post we use this design pattern to read a sample stream of tweets from Twitter.
This is the second part of a series covering basic techniques in F# async programming. Some of the samples are drawn from code in the F# JAOO Tutorial.
Pattern #3: Reporting Progress With Events
Let’s first take a look at an instance of the design pattern. Below, we define an object to coordinate the parallel execution of a group of asyncs. Each job reports its result as it finishes, rather than waiting for the whole collection of results.
The essence of the design pattern is as follows:
- The current “synchronization context” is captured from the GUI thread in the constructor of the object. This is a handle that allows us to run code and raise events in the GUI context. A private helper function is defined to trigger any F# event. This is not strictly needed but makes the code much neater.
- One or more events are defined. The events are published as properties, and annotated with [<CLIEvent>] if the object is to be used from other .NET languages.
- A background job is started, in this case by specifying an asynchronous workflow which defines the background work to be performed. Async.Start begins an instance of the workflow (though Async.StartWithContinuations is often used instead, as in a later example in this post). The events are raised at appropriate points in the execution of the background job as progress is made.
type AsyncWorker<'T>(jobs: seq<Async<'T>>) =

    // Capture the synchronization context to allow us to
    // raise events back on the GUI thread
    let syncContext = System.Threading.SynchronizationContext.Current

    // Check that we are being called from a GUI thread
    do if syncContext = null then
            failwith "Failed to capture the synchronization context of the calling thread. SynchronizationContext.Current of the calling thread is null. This component is for use from a GUI thread."

    // A standard helper to raise an event on the GUI thread
    let raiseEventOnGuiThread (event:Event<_>) args =
        syncContext.Post((fun _ -> event.Trigger args),state=null)

    // This declares an F# event that we can raise
    let jobCompleted = new Event<int * 'T>()

    /// Start an instance of the work
    member x.Start() =
        // Mark up the jobs with numbers
        let jobs = jobs |> Seq.mapi (fun i job -> (job,i+1))
        let work =
            Async.Parallel
               [ for (job,jobNumber) in jobs ->
                   async { let! result = job
                           raiseEventOnGuiThread jobCompleted (jobNumber,result)
                           return result } ]

        Async.Start(work |> Async.Ignore)

    /// Raised when a particular job completes
    member x.JobCompleted = jobCompleted.Publish
You can now use this component to supervise the execution of a collection of CPU-intensive asyncs:
let rec fib i = if i < 2 then 1 else fib (i-1) + fib (i-2)

let worker =
    new AsyncWorker<_>( [ for i in 1 .. 100 -> async { return fib (i % 40) } ] )

worker.JobCompleted.Add(fun (jobNumber, result) ->
    printfn "job %d completed with result %A" jobNumber result)

worker.Start()
When run, the progress is reported as each job completes:
job 1 completed with result 1
job 2 completed with result 2
...
job 39 completed with result 102334155
job 77 completed with result 39088169
job 79 completed with result 102334155
There are a number of ways to report results from running background processes. For 90% of cases, the easiest way is that shown above: report results by raising .NET events back on a GUI (or ASP.NET Page Load) thread. This technique fully hides the use of background threading and makes use of entirely standard .NET idioms that will be familiar to any .NET programmer. This ensures that the techniques used to implement your parallel programming are appropriately encapsulated.
Reporting Progress of I/O Asyncs
The Reporting Progress With Events pattern can also be used with I/O asyncs. For example, consider this set of I/O tasks:
open System.IO
open System.Net
open Microsoft.FSharp.Control.WebExtensions

/// Fetch the contents of a web page, asynchronously.
let httpAsync(url:string) =
    async { let req = WebRequest.Create(url)
            use! resp = req.AsyncGetResponse()
            use stream = resp.GetResponseStream()
            use reader = new StreamReader(stream)
            let text = reader.ReadToEnd()
            return text }

let urls =
    [ "http://www.live.com";
      "http://news.live.com";
      "http://www.yahoo.com";
      "http://news.yahoo.com";
      "http://www.google.com";
      "http://news.google.com"; ]

let jobs = [ for url in urls -> httpAsync url ]

let worker = new AsyncWorker<_>(jobs)

worker.JobCompleted.Add(fun (jobNumber, result) ->
    printfn "job %d completed with result %A" jobNumber result.Length)

worker.Start()
When run, the progressive results are reported, showing the lengths of each web page:
job 5 completed with result 8521
job 6 completed with result 155767
job 3 completed with result 117778
job 1 completed with result 16490
job 4 completed with result 175186
job 2 completed with result 70362
Some Jobs May Report Multiple, Different Events
In this design pattern, one reason we use an object to encapsulate and supervise the execution of a parallel composition of asyncs is that it makes it simple to enrich the API of the supervisor with further events. For example, the code below adds events that are raised when all jobs complete, when an error is detected among any of the jobs, or when the overall composition is successfully cancelled before completion. Each of these events is declared, raised and published in the code.
open System
open System.Threading
open System.IO
open Microsoft.FSharp.Control.WebExtensions

type AsyncWorker<'T>(jobs: seq<Async<'T>>) =

    // Capture the synchronization context to allow us to raise events back on the GUI thread
    let syncContext = System.Threading.SynchronizationContext.Current

    // Check that we are being called from a GUI thread
    do match syncContext with
       | null -> failwith "Failed to capture the synchronization context of the calling thread. The System.Threading.SynchronizationContext.Current of the calling thread is null. This component is for use from a GUI thread."
       | _ -> ()

    // A standard helper to raise an event on the GUI thread
    let raiseEventOnGuiThread (event:Event<_>) args =
        syncContext.Post((fun _ -> event.Trigger args),state=null)

    // Each of these lines declares an F# event that we can raise
    let allCompleted = new Event<'T[]>()
    let error = new Event<System.Exception>()
    let canceled = new Event<System.OperationCanceledException>()
    let jobCompleted = new Event<int * 'T>()

    let cancellationCapability = new CancellationTokenSource()

    /// Start an instance of the work
    member x.Start() =

        // Mark up the jobs with numbers
        let jobs = jobs |> Seq.mapi (fun i job -> (job,i+1))

        let work =
            Async.Parallel
               [ for (job,jobNumber) in jobs ->
                   async { let! result = job
                           raiseEventOnGuiThread jobCompleted (jobNumber,result)
                           return result } ]

        Async.StartWithContinuations
            ( work,
              (fun res -> raiseEventOnGuiThread allCompleted res),
              (fun exn -> raiseEventOnGuiThread error exn),
              (fun exn -> raiseEventOnGuiThread canceled exn ),
              cancellationCapability.Token)

    member x.CancelAsync() =
        cancellationCapability.Cancel()

    /// Raised when a particular job completes
    member x.JobCompleted = jobCompleted.Publish

    /// Raised when all jobs complete
    member x.AllCompleted = allCompleted.Publish

    /// Raised when the composition is cancelled successfully
    member x.Canceled = canceled.Publish

    /// Raised when the composition exhibits an error
    member x.Error = error.Publish
We can make use of these additional events in the usual way, e.g.
let worker = new AsyncWorker<_>(jobs)

worker.JobCompleted.Add(fun (jobNumber, result) ->
    printfn "job %d completed with result %A" jobNumber result.Length)

worker.AllCompleted.Add(fun results ->
    printfn "all done, results = %A" results )

worker.Start()
The supervised async workflow can support cancellation, as shown in the example above.
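As a minimal, self-contained sketch of how cancellation flows through Async.StartWithContinuations, the script below cancels a slow job and observes which continuation runs. It does not use the AsyncWorker type; the names cts, finished, outcome and slowJob, and the 10-second sleep, are assumptions made for illustration.

```fsharp
open System.Threading

// Plays the role of cancellationCapability in the component above
let cts = new CancellationTokenSource()

// Signalled once one of the three continuations has run
let finished = new ManualResetEventSlim(false)
let outcome = ref "pending"

// A deliberately slow job; Async.Sleep responds to cancellation
let slowJob =
    async { do! Async.Sleep 10000
            return 42 }

Async.StartWithContinuations(
    slowJob,
    (fun res ->
        outcome.Value <- sprintf "completed with result %d" res
        finished.Set()),
    (fun exn ->
        outcome.Value <- sprintf "failed: %s" exn.Message
        finished.Set()),
    (fun _cxn ->
        outcome.Value <- "canceled"
        finished.Set()),
    cts.Token)

// Request cancellation, as CancelAsync() does in the component above
cts.Cancel()

// Wait (with a timeout) for a continuation to report back
finished.Wait(5000) |> ignore
printfn "outcome = %s" outcome.Value
```

With the AsyncWorker above, the equivalent would be to add a handler to worker.Canceled and then call worker.CancelAsync().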
Tweet Tweet, Tweet Tweet
The Reporting Progress With Events pattern can be applied to pretty much any background processing component which reports results along the way. In the following example, we use the pattern to encapsulate the background read of a stream of tweets from Twitter (see the Twitter API pages). The sample requires a Twitter account and password. Only one event is raised in this case, though the sample could be extended to raise other events in other conditions.
A version of this sample is included in the F# JAOO Tutorial.
// F# Twitter Feed Sample using F# Async Programming and Event processing
//

#r "System.Web.dll"
#r "System.Windows.Forms.dll"
#r "System.Xml.dll"

open System
open System.Globalization
open System.IO
open System.Net
open System.Web
open System.Threading
open Microsoft.FSharp.Control.WebExtensions

/// A component which listens to tweets in the background and raises an
/// event each time a tweet is observed
type TwitterStreamSample(userName:string, password:string) =

    let syncContext = System.Threading.SynchronizationContext.Current

    // A standard helper to raise an event on the GUI thread
    let raiseEventOnGuiThread (event:Event<_>) args =
        syncContext.Post((fun _ -> event.Trigger args),state=null)

    let tweetEvent = new Event<_>()

    let streamSampleUrl = "http://stream.twitter.com/1/statuses/sample.xml?delimited=length"

    /// The background process
    let listener =
        async { let credentials = NetworkCredential(userName, password)
                let req = WebRequest.Create(streamSampleUrl, Credentials=credentials)
                use! resp = req.AsyncGetResponse()
                use stream = resp.GetResponseStream()
                use reader = new StreamReader(stream)
                // Each tweet is preceded by a line giving its length in characters
                let rec loop() =
                    async {
                        let atEnd = reader.EndOfStream
                        if not atEnd then
                            let sizeLine = reader.ReadLine()
                            let size = int sizeLine
                            let buffer = Array.zeroCreate size
                            let _numRead = reader.ReadBlock(buffer,0,size)
                            let text = new System.String(buffer)
                            raiseEventOnGuiThread tweetEvent text
                            return! loop()
                    }
                return! loop() }

    /// The cancellation condition
    let mutable group = new CancellationTokenSource()

    /// Start listening to a stream of tweets
    member this.StartListening() = Async.Start(listener, group.Token)

    /// Stop listening to a stream of tweets
    member this.StopListening() =
        group.Cancel();
        group <- new CancellationTokenSource()

    /// Raised when the XML for a tweet arrives
    member this.NewTweet = tweetEvent.Publish
This raises an event each time a tweet occurs from the standard sample stream provided by Twitter, and provides the contents of that tweet. We can listen into this stream as follows:
let userName = "..." // set Twitter user name here
let password = "..." // set Twitter password here

let twitterStream = new TwitterStreamSample(userName, password)

twitterStream.NewTweet
   |> Event.add (fun s -> printfn "%A" s)

twitterStream.StartListening()
twitterStream.StopListening()
When run, a stream of the raw XML for tweets is printed (pretty quickly!). See the Twitter API pages for how this stream is sampled.
If you would also like to parse these tweets, here’s some sample code that does an approximate job of this (though also be aware of the guidance on the Twitter API pages, e.g. that tweets should often be saved or queued before processing when building a high-reliability system).
#r "System.Xml.dll"
#r "System.Xml.Linq.dll"

open System.Xml
open System.Xml.Linq

let xn (s:string) = XName.op_Implicit s

/// The results of the parsed tweet
type UserStatus =
    { UserName : string
      ProfileImage : string
      Status : string
      StatusDate : DateTime }

/// Attempt to parse a tweet
let parseTweet (xml: string) =
    let document = XDocument.Parse xml
    let node = document.Root
    if node.Element(xn "user") <> null then
        // The date format uses English day and month abbreviations,
        // so parse with the invariant culture rather than the current one
        Some { UserName = node.Element(xn "user").Element(xn "screen_name").Value;
               ProfileImage = node.Element(xn "user").Element(xn "profile_image_url").Value;
               Status = node.Element(xn "text").Value |> HttpUtility.HtmlDecode;
               StatusDate = node.Element(xn "created_at").Value |> (fun msg ->
                                DateTime.ParseExact(msg, "ddd MMM dd HH:mm:ss +0000 yyyy",
                                                    CultureInfo.InvariantCulture)); }
    else
        None
And combinator programming can be used to pipeline from this stream:
twitterStream.NewTweet
   |> Event.choose parseTweet
   |> Event.add (fun s -> printfn "%A" s)

twitterStream.StartListening()
And to collect statistics from the stream:
/// Add a value to the list of values stored under the given key
let addToMultiMap key x multiMap =
    let prev = match Map.tryFind key multiMap with None -> [] | Some v -> v
    Map.add key (x::prev) multiMap

/// An event which triggers on every 'n' triggers of the input event
let every n (ev:IEvent<_>) =
    let out = new Event<_>()
    let count = ref 0
    ev.Add (fun arg -> incr count; if !count % n = 0 then out.Trigger arg)
    out.Publish

twitterStream.NewTweet
   |> Event.choose parseTweet
   // Build up the table of tweets indexed by user
   |> Event.scan (fun z x -> addToMultiMap x.UserName x z) Map.empty
   // Take every 20’th index
   |> every 20
   // Listen and display the average of #tweets/user
   |> Event.add (fun s ->
        let avg = s |> Seq.averageBy (fun (KeyValue(_,d)) -> float d.Length)
        printfn "#users = %d, avg tweets = %g" s.Count avg)

twitterStream.StartListening()
This indexes the tweets by user and determines the average number of tweets from each user in this sample stream, reporting results every 20 successfully parsed tweets:
#users = 19, avg tweets = 1.05263
#users = 39, avg tweets = 1.02564
#users = 59, avg tweets = 1.01695
#users = 79, avg tweets = 1.01266
#users = 99, avg tweets = 1.0101
#users = 118, avg tweets = 1.01695
#users = 138, avg tweets = 1.01449
#users = 158, avg tweets = 1.01266
#users = 178, avg tweets = 1.01124
#users = 198, avg tweets = 1.0101
#users = 218, avg tweets = 1.00917
#users = 237, avg tweets = 1.01266
#users = 257, avg tweets = 1.01167
#users = 277, avg tweets = 1.01083
#users = 297, avg tweets = 1.0101
#users = 317, avg tweets = 1.00946
#users = 337, avg tweets = 1.0089
#users = 357, avg tweets = 1.0084
#users = 377, avg tweets = 1.00796
#users = 396, avg tweets = 1.0101
#users = 416, avg tweets = 1.00962
#users = 435, avg tweets = 1.01149
#users = 455, avg tweets = 1.01099
#users = 474, avg tweets = 1.01266
#users = 494, avg tweets = 1.01215
#users = 514, avg tweets = 1.01167
#users = 534, avg tweets = 1.01124
#users = 554, avg tweets = 1.01083
#users = 574, avg tweets = 1.01045
#users = 594, avg tweets = 1.0101
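The combinators used above can be tried without a Twitter account. The sketch below is a stand-in rather than part of the original sample: it drives Event.choose and Event.scan from a synthetic event, and the names source, tryParseInt and totals are made up for illustration.

```fsharp
// A synthetic event source standing in for twitterStream.NewTweet
let source = new Event<string>()

// Hypothetical parser: keeps only the strings that parse as integers
let tryParseInt (s: string) =
    match System.Int32.TryParse s with
    | true, n -> Some n
    | _ -> None

// Collects every value produced by the pipeline
let totals = ResizeArray<int>()

source.Publish
   // Drop inputs that fail to parse
   |> Event.choose tryParseInt
   // Keep a running total of the parsed values
   |> Event.scan (fun total n -> total + n) 0
   |> Event.add (fun total -> totals.Add total)

for s in [ "1"; "oops"; "2"; "3" ] do
    source.Trigger s

printfn "%A" (List.ofSeq totals)   // prints [1; 3; 6]
```

As with the tweet pipeline, the unparseable input is silently dropped by Event.choose, and Event.scan fires once per surviving input with the updated state.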
Using a slightly different analysis we can display those users who have tweeted more than once in the sample stream provided by Twitter, along with their latest tweet. This is executed interactively from F# Interactive and uses the F# Interactive data grid view snippet from a previous post:
open System.Drawing
open System.Windows.Forms

let form = new Form(Visible = true, Text = "A Simple F# Form", TopMost = true, Size = Size(600,600))

let data = new DataGridView(Dock = DockStyle.Fill, Text = "F# Programming is Fun!",
                            Font = new Font("Lucida Console",12.0f),
                            ForeColor = Color.DarkBlue)

form.Controls.Add(data)

data.DataSource <- [| (10,10,10) |]
data.Columns.[0].Width <- 200
data.Columns.[2].Width <- 500

twitterStream.NewTweet
   |> Event.choose parseTweet
   // Build up the table of tweets indexed by user
   |> Event.scan (fun z x -> addToMultiMap x.UserName x z) Map.empty
   // Take every 20’th index
   |> every 20
   // Listen and display those with more than one tweet
   |> Event.add (fun s ->
        let moreThanOneMessage = s |> Seq.filter (fun (KeyValue(_,d)) -> d.Length > 1)
        data.DataSource <-
            moreThanOneMessage
            |> Seq.map (fun (KeyValue(user,d)) -> (user, d.Length, d.Head.Status))
            |> Seq.filter (fun (_,n,_) -> n > 1)
            |> Seq.sortBy (fun (_,n,_) -> -n)
            |> Seq.toArray)

twitterStream.StartListening()
Here are some sample results:
Note: In the above example, we have used blocking I/O to read the Twitter stream. This is adequate for two reasons. First, the Twitter stream is very active (and probably will remain so for some time :-)). Second, we can assume that there are not many outstanding connections to many Twitter streams: in this case there is only one, and in any case it appears Twitter places limitations on how many times you can listen to the sample stream for an account. In a later post we’ll show how to do a non-blocking read of this kind of stream of XML fragments.
F# for Parallel, C#/VB for GUI
The Reporting Progress With Events pattern is highly useful for the case where the F# programmer implements the background computation components, based on some inputs, and the C# or VB programmer uses these components. In this case, the published events should be labeled with [<CLIEvent>] to ensure they appear as .NET events to the C# or VB programmer. For the second example above, you would use
    /// Raised when a particular job completes
    [<CLIEvent>]
    member x.JobCompleted = jobCompleted.Publish

    /// Raised when all jobs complete
    [<CLIEvent>]
    member x.AllCompleted = allCompleted.Publish

    /// Raised when the composition is cancelled successfully
    [<CLIEvent>]
    member x.Canceled = canceled.Publish

    /// Raised when the composition exhibits an error
    [<CLIEvent>]
    member x.Error = error.Publish
Limitations of the Pattern
The Reporting Progress With Events pattern assumes that a parallel processing component is hosted in a GUI application (e.g. Windows Forms), server-side application (e.g. ASP.NET) or some other context where it is possible to raise events back to some supervisor. It is possible to adjust the pattern to raise events in other ways, e.g. to post a message to a MailboxProcessor or simply to log them. However, be aware that there is still an assumption in the design pattern that some kind of main thread or supervisor exists that is ready to listen to the events at any moment and queue them sensibly.
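As a rough sketch of the MailboxProcessor variation, the jobs below post their progress to an agent instead of raising events on a GUI thread. The message type ProgressMsg and the names supervisor, jobs and reported are assumptions introduced for this illustration; the agent simply queues reports and hands them back on request.

```fsharp
// Messages understood by the supervising agent (hypothetical names)
type ProgressMsg =
    | JobCompleted of int * int
    | GetCompleted of AsyncReplyChannel<(int * int) list>

// The agent receives progress reports and processes them one at a time
let supervisor =
    MailboxProcessor.Start(fun inbox ->
        let rec loop completed =
            async { let! msg = inbox.Receive()
                    match msg with
                    | JobCompleted (jobNumber, result) ->
                        return! loop ((jobNumber, result) :: completed)
                    | GetCompleted reply ->
                        reply.Reply (List.rev completed)
                        return! loop completed }
        loop [])

// Each job posts its result to the agent rather than raising an event
let jobs =
    [ for i in 1 .. 5 ->
        async { let result = i * i
                supervisor.Post (JobCompleted (i, result))
                return result } ]

let results = Async.Parallel jobs |> Async.RunSynchronously

// Ask the agent which reports it has received so far
let reported = supervisor.PostAndReply GetCompleted
printfn "%d jobs reported progress" reported.Length
```

Because the mailbox is a queue, the agent plays the role of the supervisor that is always ready to accept a report, which is the assumption the pattern relies on.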
The Reporting Progress With Events pattern also assumes that the encapsulating object is able to capture the synchronization context of the GUI thread, normally implicitly (as in the examples above). This is usually a reasonable assumption. Alternatively this context could be given as an explicit parameter, though that is not a very common idiom in .NET programming.
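A minimal sketch of the explicit-parameter alternative might look like the following. ExplicitContextWorker and ImmediateContext are hypothetical names, not part of the tutorial code; ImmediateContext exists only so that the sketch can run in a script without a GUI, and Run blocks the caller purely for demonstration.

```fsharp
open System.Threading

/// A trivial context that runs posted callbacks immediately on the posting
/// thread. (Hypothetical helper so that the sketch runs without a GUI.)
type ImmediateContext() =
    inherit SynchronizationContext()
    override this.Post(callback, state) = callback.Invoke state

/// A variant of the worker that is handed its context explicitly
type ExplicitContextWorker<'T>(context: SynchronizationContext, jobs: seq<Async<'T>>) =

    let jobCompleted = new Event<int * 'T>()

    // Raise the event through the context supplied by the caller
    let raiseEventOnContext (event: Event<_>) args =
        context.Post((fun _ -> event.Trigger args), null)

    /// Raised when a particular job completes
    member x.JobCompleted = jobCompleted.Publish

    /// Run all jobs, blocking until they finish (convenient in a script;
    /// a GUI application would use Async.Start as in the examples above)
    member x.Run() =
        [ for (i, job) in Seq.indexed jobs ->
            async { let! result = job
                    raiseEventOnContext jobCompleted (i + 1, result)
                    return result } ]
        |> Async.Parallel
        |> Async.RunSynchronously

let seen = ResizeArray<int * int>()

let worker =
    ExplicitContextWorker<int>(ImmediateContext(), [ for i in 1 .. 4 -> async { return i * 10 } ])

// With ImmediateContext the handler runs on pool threads, so guard the list
worker.JobCompleted.Add(fun report -> lock seen (fun () -> seen.Add report))

let results = worker.Run()
printfn "results = %A, events seen = %d" results seen.Count
```

In a real GUI application the caller would pass SynchronizationContext.Current captured on the GUI thread, and the handlers would then run on that thread without any locking.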
For those familiar with the IObservable interface (added in .NET 4.0), you might have considered having the TwitterStreamSample type implement this interface. However, for root sources of events this doesn’t necessarily gain that much. For example, in the future, the TwitterStreamSample may need to provide multiple events, such as reporting auto-reconnections if errors occur, or reporting pauses or delays. In this scenario, simply raising .NET events is adequate, partly to ensure your object looks familiar to many .NET programmers. In F#, all published IEvent<_> values implement IObservable automatically and can be used directly with observable combinators.
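To illustrate that last point, here is a small sketch that passes a published event straight to the functions in the Observable module. The event tweetEvent stands in for the component's private event, and lengths and subscription are names made up for this example.

```fsharp
// Stands in for the private event inside a component such as TwitterStreamSample
let tweetEvent = new Event<string>()

let lengths = ResizeArray<int>()

// A published IEvent<_> can be used wherever an IObservable<_> is expected
let subscription =
    tweetEvent.Publish
    |> Observable.filter (fun text -> text.Length > 0)
    |> Observable.map (fun text -> text.Length)
    |> Observable.subscribe (fun n -> lengths.Add n)

tweetEvent.Trigger "hello"
tweetEvent.Trigger ""
tweetEvent.Trigger "F# async"

// Disposing the subscription detaches the handler from the event
subscription.Dispose()
tweetEvent.Trigger "ignored"

printfn "%A" (List.ofSeq lengths)   // prints [5; 8]
```

Unlike Event.add, Observable.subscribe returns an IDisposable, which gives the listener a way to stop listening.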
Conclusion
The Reporting Progress With Events pattern is a powerful and elegant way to encapsulate parallel execution behind a boundary while still reporting results and progress.
From the outside, the AsyncWorker object effectively appears single threaded. Assuming your input asyncs are isolated, this means the component does not expose the rest of your program to multi-threaded race conditions. All users of Javascript, ASP.NET and GUI frameworks like Windows Forms know that the single-threadedness of those frameworks is both a blessing and a curse: you get simplicity (no data races!), but parallel and asynchronous programming is hard. In .NET programming, I/O and heavy CPU computations have to be offloaded to background threads. The above design pattern gives you the best of both worlds: you get independent, cooperative, “chatty” background processing components, including ones that do parallel processing and I/O, while maintaining the simplicity of single threaded GUI programming for most of your code. These components can be generic and reusable, like the ones shown above. This makes them amenable to independent unit testing.
In future blog posts we’ll be looking at additional design topics for parallel and reactive programming with F# async, including
- defining lightweight async agents
- authoring .NET tasks using async
- authoring the .NET APM patterns using async
- cancelling asyncs
Sun, 10 Jan 2010 23:44:00 GMT
Async and Parallel Design Patterns in F#: Parallelizing CPU and I/O Computations
F# is both a parallel and a reactive language. By this we mean that running F# programs can have both multiple active evaluations (e.g. .NET threads actively computing F# results), and multiple pending reactions (e.g. callbacks and agents waiting to react to events and messages).
One simple way to write parallel and reactive programs is with F# async expressions. In this and future posts, I will cover some of the basic ways in which you can use F# async programming - roughly speaking, these are design patterns enabled by F# async programming. I assume you already know the basics of using async, e.g. see this introductory guide.
We’ll start with two easy design patterns: Parallel CPU Asyncs and Parallel I/O Asyncs.
Pattern #1: Parallel CPU Asyncs
Let’s take a look at an example of our first pattern: Parallel CPU Asyncs, that is, running a set of CPU-bound computations in parallel. The code below computes the Fibonacci function, and schedules the computations in parallel:
let rec fib x = if x <= 2 then 1 else fib(x-1) + fib(x-2)

let fibs =
    Async.Parallel [ for i in 1..40 -> async { return fib(i) } ]
    |> Async.RunSynchronously
Producing:
val fibs : int array =
[|1; 1; 2; 3; 5; 8; 13; 21; 34; 55; 89; 144; 233; 377; 610; 987; 1597; 2584;
4181; 6765; 10946; 17711; 28657; 46368; 75025; 121393; 196418; 317811;
514229; 832040; 1346269; 2178309; 3524578; 5702887; 9227465; 14930352;
24157817; 39088169; 63245986; 102334155|]
The above code sample shows the elements of the Parallel CPU Asyncs pattern:
(a) “async { … }” is used to specify a number of CPU tasks
(b) These are composed in parallel using the fork-join combinator Async.Parallel
In this case the composition is executed using Async.RunSynchronously, which starts an instance of the async and synchronously waits for the overall result.
You can use this pattern for many routine CPU parallelization jobs (e.g. dividing and parallelizing a matrix multiply), and for batch processing jobs.
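As one possible sketch of the matrix multiply case, the function below computes each row of the product as a separate CPU async and joins the rows with Async.Parallel. The name multiplyParallel and the representation of matrices as arrays of rows are assumptions made for this example, and no attempt is made to tune the granularity of the work.

```fsharp
/// Multiply two matrices, computing each row of the result as a separate async.
/// Matrices are represented as arrays of rows (an assumption of this sketch).
let multiplyParallel (a: float[][]) (b: float[][]) =
    let inner = b.Length
    let cols = b.[0].Length

    // Dot product of one row of a with column j of b
    let dot (row: float[]) j =
        let mutable sum = 0.0
        for k in 0 .. inner - 1 do
            sum <- sum + row.[k] * b.[k].[j]
        sum

    a
    |> Array.map (fun row -> async { return Array.init cols (dot row) })
    |> Async.Parallel
    |> Async.RunSynchronously

let product =
    multiplyParallel
        [| [| 1.0; 2.0 |]; [| 3.0; 4.0 |] |]
        [| [| 5.0; 6.0 |]; [| 7.0; 8.0 |] |]

printfn "%A" product
```

For small matrices like this one the overhead of scheduling outweighs any gain; the division into rows only pays off when each row involves a substantial amount of work.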
Pattern #2: Parallel I/O Asyncs
So far we have only seen parallel CPU-bound programming with F#. One key thing about F# async programming is that you can use it for both CPU and I/O computations. This leads to our second pattern: Parallel I/O Asyncs, i.e. doing I/O operations in parallel (also known as overlapped I/O). For example, the following code requests multiple web pages in parallel, reacts to the response for each request, and returns the collected results.
open System
open System.IO
open System.Net
open Microsoft.FSharp.Control.WebExtensions

/// Fetch the contents of a web page, asynchronously
let http (url: string) =
    async { let req = WebRequest.Create(Uri url)
            use! resp = req.AsyncGetResponse()
            use stream = resp.GetResponseStream()
            use reader = new StreamReader(stream)
            let contents = reader.ReadToEnd()
            return contents }

let sites = ["http://www.bing.com";
             "http://www.google.com";
             "http://www.yahoo.com";
             "http://www.search.com"]

let htmlOfSites =
    Async.Parallel [for site in sites -> http site ]
    |> Async.RunSynchronously
The above code sample shows the essence of the Parallel I/O Asyncs pattern:
(a) “async { … }” is used to write tasks which include some asynchronous I/O.
(b) These are composed in parallel using the fork-join combinator Async.Parallel
In this case, the composition is executed using Async.RunSynchronously, which synchronously waits for the overall result.
Using let! (or its resource-disposing equivalent use!) is one basic way of composing asyncs. A line such as
let! resp = req.AsyncGetResponse()
causes a “reaction” to occur when a response to the HTTP GET occurs. That is, the rest of the async { … } runs when the AsyncGetResponse operation completes. However, no .NET or operating system thread is blocked while waiting for this reaction: only active CPU computations use an underlying .NET or O/S thread. In contrast, pending reactions (for example, callbacks, event handlers and agents) are relatively cheap, often as cheap as a single registered object. As a result you can have thousands or even millions of pending reactions. For example, a typical GUI application has many registered event handlers, and a typical web crawler has a registered handler for each outstanding web request.
In the above, "use!" replaces "let!" and indicates that the resource associated with the web request should be disposed at the end of the lexical scope of the variable.
One of the nice things about I/O parallelization is scaling. With multi-core CPU-bound programming you often see 2x, 4x or 8x speedups if you work hard enough on a many-core machine. With I/O parallel programming you can perform hundreds or thousands of operations in parallel (though actual parallelization depends on your operating system and network connections), giving speedups of 10x, 100x, 1000x or more, even on a single-core machine. For example, see the use of F# asyncs in this nice sample, ultimately called from an IronPython application.
Many modern applications are I/O bound so it’s important to be able to recognize and apply this design pattern in practice.
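To get a feel for how cheap pending reactions are, the sketch below starts 10,000 asyncs that each wait for half a second. Async.Sleep is used here as a stand-in for a pending I/O request, since it also waits without holding a thread; the names waitThenReturn and waits and the specific numbers are arbitrary choices for illustration.

```fsharp
open System.Diagnostics

let stopwatch = Stopwatch.StartNew()

// Each async waits 500 ms without blocking a thread, then returns its index
let waitThenReturn i =
    async { do! Async.Sleep 500
            return i }

// Compose 10,000 such waits in parallel and collect the results
let waits =
    [ for i in 1 .. 10000 -> waitThenReturn i ]
    |> Async.Parallel
    |> Async.RunSynchronously

printfn "%d waits completed in %d ms" waits.Length stopwatch.ElapsedMilliseconds
```

Run one after another, these waits would take well over an hour. Composed in parallel they should finish in something much closer to the length of a single wait, though the exact time will vary by machine.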
Starting on the GUI Thread, finishing on the GUI thread
There is an important variation on both of these design patterns. This is where Async.RunSynchronously is replaced by Async.StartWithContinuations. Here the parallel composition is started and you specify three functions to run when the async completes with success, failure or cancellation.
Whenever you face the problem “I need to get the result of an async but I really don’t want to use RunSynchronously”, you should consider either:
(a) starting the async as part of a larger async by using let! (or use!), or
(b) starting the async with Async.StartWithContinuations
Async.StartWithContinuations is very useful when starting asyncs on the GUI thread: you never want to block the GUI thread, so instead you schedule some GUI updates to occur when the async completes. For example, this is used in the BingTranslator examples in the F# JAOO Tutorial code. A full version of this sample is shown at the end of this blog post, but the important thing here is to note what happens when the “Translate” button is pressed:
button.Click.Add(fun args ->

    let text = textBox.Text
    translated.Text <- "Translating..."

    let task =
        async { let! languages = httpLines languageUri
                let! fromLang = detectLanguage text
                let! results = Async.Parallel [for lang in languages -> translateText (text, fromLang, lang)]
                return (fromLang,results) }

    Async.StartWithContinuations(
        task,
        (fun (fromLang,results) ->
            for (toLang, translatedText) in results do
                translated.Text <- translated.Text + sprintf "\r\n%s --> %s: \"%s\"" fromLang toLang translatedText),
        (fun exn -> MessageBox.Show(sprintf "An error occurred: %A" exn) |> ignore),
        (fun cxn -> MessageBox.Show(sprintf "A cancellation error occurred: %A" cxn) |> ignore)))
Here the async is specified first, in the binding of task, and this includes a use of Async.Parallel to translate the input text into multiple languages in parallel. The composite async is then started with Async.StartWithContinuations. This call returns as soon as the async hits its first I/O operation, and specifies three functions to run when the async completes with success, failure or cancellation. Here is a screen shot of the operation after the task completes (no guarantees given about the accuracy of the translation...)
Async.StartWithContinuations has the important property that if the async is started on the GUI thread (i.e. a thread with a non-null SynchronizationContext.Current), then the completion function is called on the GUI thread. This makes it safe to update the results. The F# async library allows you to specify composite I/O tasks and use them from the GUI thread without having to marshal your updates from background threads, a topic we’ll explore in later posts.
Some notes on how Async.Parallel works:
- When run, asyncs composed with Async.Parallel are initially started through a queue of pending computations. Ultimately this uses QueueUserWorkItem, like most async processing libraries. It is possible to use a separate queue, something we’ll discuss in later posts.
- There is nothing particularly magical about Async.Parallel: you can define your own async combinators that coordinate asyncs in different ways by using other primitives in the Microsoft.FSharp.Control.Async library such as Async.StartChild. We’ll return to this topic in a later post.
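To make the second note concrete, here is a minimal sketch of a user-defined combinator built on Async.StartChild. The names both and slowValue are hypothetical and chosen only for this illustration; this is one possible way to coordinate two asyncs, not how Async.Parallel itself is implemented.

```fsharp
// A hypothetical combinator: run two asyncs in parallel and return both results.
// Async.StartChild starts a computation and returns an async that waits for its result.
let both (a: Async<'T>) (b: Async<'U>) : Async<'T * 'U> =
    async { let! aResult = Async.StartChild a
            let! bResult = Async.StartChild b
            let! x = aResult
            let! y = bResult
            return (x, y) }

// Hypothetical helper: produce a value after a non-blocking delay.
let slowValue (ms: int) value =
    async { do! Async.Sleep ms
            return value }

let pair = both (slowValue 200 "left") (slowValue 200 42) |> Async.RunSynchronously
printfn "%A" pair
```

Unlike Async.Parallel, which takes a sequence of asyncs with a single result type, a combinator like this can pair asyncs of different types.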
More Examples
Example uses of these patterns in the F# JAOO Tutorial code are
- BingTranslator.fsx and BingTranslatorShort.fsx: calling a REST API using F#. The same approach applies to any web-based HTTP service. A version of this sample is given below.
- AsyncImages.fsx: parallel disk I/O and image processing
- PeriodicTable.fsx: calling a web service, fetching atomic weights in parallel
Limitations of the Patterns
The two parallel patterns shown here have some limitations. Notably, an async generated by Async.Parallel is not, when run, “chatty” – for example, it doesn’t report progress or partial results. To handle that we need to build a more chatty object that raises events as partial operations complete. We’ll be looking at that design pattern in later posts.
Also, Async.Parallel handles a fixed number of jobs. In later posts we'll look at many examples where jobs get generated as work progresses. Another way to look at that is that an async generated by Async.Parallel does not immediately accept incoming messages, i.e. it is not an agent whose progress can be directed, apart from cancellation.
Asyncs generated by Async.Parallel do support cancellation. Cancellation is not effective until all the sub-tasks have completed or been effectively cancelled. This is normally what you want.
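As a small sketch of cancellation, the following runs a parallel composition under a cancellation token and cancels it after 200 ms. The job function, the delays and the use of CancelAfter are assumptions made for the example. It assumes that Async.RunSynchronously reports cancellation by raising an OperationCanceledException (or a subclass of it).

```fsharp
open System
open System.Threading

// Hypothetical long-running job: a 5 second non-blocking wait.
let job (i: int) =
    async { do! Async.Sleep 5000
            return i }

let allJobs = Async.Parallel [ for i in 1 .. 10 -> job i ]

let cts = new CancellationTokenSource()
// Request cancellation of the whole composition after 200 ms.
cts.CancelAfter(200)

let outcome =
    try
        Async.RunSynchronously(allJobs, cancellationToken = cts.Token) |> ignore
        "completed"
    with
    | :? OperationCanceledException -> "cancelled"

printfn "Outcome: %s" outcome
```

The same token could instead be passed as the final argument to Async.StartWithContinuations, in which case the cancellation continuation is the one that runs.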
Conclusion
The Parallel CPU Asyncs and Parallel I/O Asyncs patterns are probably the two simplest design patterns using F# async programming. As is often the case with simple things, they are important and powerful. Note that the only difference between the patterns is that I/O Parallel uses asyncs which include (and are often dominated by) I/O requests, plus some CPU processing to create request objects and to do post-processing.
In future blog posts we’ll be looking at additional design topics for parallel and reactive programming with F# async, including
- starting asyncs from the GUI thread
- defining lightweight async agents
- defining background worker components using async
- authoring .NET tasks using async
- authoring the .NET APM patterns using async
- cancelling asyncs
BingTranslator Code Sample
Here’s the sample code for the BingTranslator example. You’ll need a Live API 1.1 AppID to run it.
(NOTE: the samples would need to be adjusted for the Bing API 2.0. Notably, the language detection API is not present in 2.0. However, the code should still act as a good guide.)
open System
open System.Net
open System.IO
open System.Drawing
open System.Windows.Forms
open System.Text
/// A standard helper to read all the lines of a HTTP request. The actual read of the lines is
/// synchronous once the HTTP response has been received.
let httpLines (uri:string) =
    async { let request = WebRequest.Create uri
            use! response = request.AsyncGetResponse()
            use stream = response.GetResponseStream()
            use reader = new StreamReader(stream)
            let lines = [ while not reader.EndOfStream do yield reader.ReadLine() ]
            return lines }
type System.Net.WebRequest with
    /// An extension member to write content into a WebRequest.
    /// The write of the content is synchronous.
    member req.WriteContent (content:string) =
        let bytes = Encoding.UTF8.GetBytes content
        req.ContentLength <- int64 bytes.Length
        use stream = req.GetRequestStream()
        stream.Write(bytes,0,bytes.Length)
    /// An extension member to read the content from a response to a WebRequest.
    /// The read of the content is synchronous once the response has been received.
    member req.AsyncReadResponse () =
        async { use! response = req.AsyncGetResponse()
                use responseStream = response.GetResponseStream()
                use reader = new StreamReader(responseStream)
                return reader.ReadToEnd() }
#load @"C:\fsharp\staging\docs\presentations\2009-10-04-jaoo-tutorial\BingAppId.fs"
//let myAppId = "please set your Bing AppId here"
/// The URIs for the REST service we are using
let detectUri = "http://api.microsofttranslator.com/V1/Http.svc/Detect?appId=" + myAppId
let translateUri = "http://api.microsofttranslator.com/V1/Http.svc/Translate?appId=" + myAppId + "&"
let languageUri = "http://api.microsofttranslator.com/V1/Http.svc/GetLanguages?appId=" + myAppId
let languageNameUri = "http://api.microsofttranslator.com/V1/Http.svc/GetLanguageNames?appId=" + myAppId
/// Create the user interface elements
let form = new Form (Visible=true, TopMost=true, Height=500, Width=600)
let textBox = new TextBox (Width=450, Text="Enter some text", Font=new Font("Consolas", 14.0F))
let button = new Button (Text="Translate", Left = 460)
let translated = new TextBox (Width = 590, Height = 400, Top = 50, ScrollBars = ScrollBars.Both, Multiline = true, Font=new Font("Consolas", 14.0F))
form.Controls.Add textBox
form.Controls.Add button
form.Controls.Add translated
/// An async method to call the language detection API
let detectLanguage text =
    async { let request = WebRequest.Create (detectUri, Method="Post", ContentType="text/plain")
            do request.WriteContent text
            return! request.AsyncReadResponse() }
/// An async method to call the text translation API
let translateText (text, fromLang, toLang) =
    async { let uri = sprintf "%sfrom=%s&to=%s" translateUri fromLang toLang
            let request = WebRequest.Create (uri, Method="Post", ContentType="text/plain")
            request.WriteContent text
            let! translatedText = request.AsyncReadResponse()
            return (toLang, translatedText) }
button.Click.Add(fun args ->
    let text = textBox.Text
    translated.Text <- "Translating..."
    let task =
        async { /// Get the supported languages
                let! languages = httpLines languageUri
                /// Detect the language of the input text. This could be done in parallel with the previous step.
                let! fromLang = detectLanguage text
                /// Translate into each language, in parallel
                let! results = Async.Parallel [for lang in languages -> translateText (text, fromLang, lang)]
                /// Return the results
                return (fromLang,results) }
    /// Start the task. When it completes, show the results.
    Async.StartWithContinuations(
        task,
        (fun (fromLang,results) ->
            for (toLang, translatedText) in results do
                translated.Text <- translated.Text + sprintf "\r\n%s --> %s: \"%s\"" fromLang toLang translatedText),
        (fun exn -> MessageBox.Show(sprintf "An error occurred: %A" exn) |> ignore),
        (fun cxn -> MessageBox.Show(sprintf "A cancellation error occurred: %A" cxn) |> ignore)))
Sat, 09 Jan 2010 18:35:00 GMT
Email: dsyme ... microsoft ... com
Phone: +44 1223 479806
October 2009 F# Release
Announcing the latest release of the F# compiler, library and Visual Studio tools. Don details the new features of this release.
Mon, 02 Nov 2009 06:20:00 Z
F# in VS2010
S. Somasegar highlights a few of the exciting features that F# brings to VS2010, including a Simple and Succinct Syntax, simplified Parallel and Asynchronous Programming and Units of Measure.
Mon, 02 Nov 2009 06:20:00 Z
Online Content for Learning F#
A catalogue of great F# resources from around the web. Brian includes videos, forums and many useful blog posts in his list.
Mon, 02 Nov 2009 06:20:00 Z
- Don Syme, The F# Draft Language Specification, Microsoft, 1 February 2009
- Don Syme, Introduction to F# (recorded lecture), Microsoft, 12 February 2008
- Don Syme, Parallel Functional Programming on .NET with F# (recorded lecture), Microsoft, 12 February 2008
- Don Syme, Adam Granicz, and Antonio Cisternino, Expert F#, Springer Verlag, 10 January 2008
- Don Syme, Gregory Neverov, and James Margetson, Extensible pattern matching via a lightweight language extension, in Proceedings of the 12th ACM SIGPLAN international conference on Functional programming, Association for Computing Machinery, Inc., 3 October 2007
- Don Syme and Gregory Neverov, Combining Total and Ad Hoc Extensible Pattern Matching in a Lightweight Language Extension, no. MSR-TR-2007-33, April 2007
- Don Syme, Initializing Mutually Referential Abstract Objects: The Value Recursion Challenge, in Proceedings of the ACM-SIGPLAN Workshop on ML (2005), Elsevier, 11 March 2006
- Don Syme, Initializing Mutually Referential Abstract Objects (talk slides), Microsoft, 11 March 2006
- Don Syme, An Alternative Approach to Initializing Mutually Referential Objects, no. MSR-TR-2005-31, March 2005
- Carl Seger, Robert Jones, Mark Aagaard, Tom Melham, Clark Barrett, and Don Syme, An Industrially Effective Environment for Hardware Verification, Institute of Electrical and Electronics Engineers, Inc., February 2005
- Andrew J. Kennedy and Don Syme, Transposing F to C#: Expressivity of parametric polymorphism in an object-oriented language, in Concurrency and Computation: Practice and Experience, vol. 16, no. 7, Wiley, June 2004
- Don Syme and Andrew Kennedy, Transposing F to C#: Expressivity of polymorphism in an object-oriented language, Wiley, June 2004
- Don Syme and Andrew Kennedy, Combining Generics, Pre-compilation and Sharing Between Software-Based Processes, January 2004
- Dachuan Yu, Andrew J. Kennedy, and Don Syme, Formalization of Generics for the .NET Common Language Runtime, in POPL '04: Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, ACM Press, January 2004
- Don Syme and Andy Gordon, Automating Type Soundness Proofs via Decision Procedures and Guided Reductions, Springer-Verlag, October 2002
- Don Syme, ILX: Extending the .NET Common IL for Functional Language Interoperability, Springer-Verlag, September 2001
- A. J. Kennedy and D. Syme, Design and Implementation of Generics for the .NET Common Language Runtime, in Programming Language Design and Implementation, ACM Press, January 2001
- Don Syme and Andy Gordon, Typing a multilanguage intermediate code, Association for Computing Machinery, Inc., December 2000
- Donald Syme, Declarative Theorem Proving for Operational Semantics, April 1999
- Donald Syme, Three Tactic Theorem Proving, January 1998
- Don Syme, Interaction for Declarative Theorem Proving, January 1998
- Don Syme, Towards a Machine Checked, Readable Proof of Java Type Soundness, April 1997
- Donald Syme, DECLARE: A Prototype Declarative Proof System for Higher Order Logic, January 1997
- Don Syme, Proving Java Type Soundness, January 1997
- Donald Syme, A New Interface for HOL - Ideas, Issues and Implementation, Springer-Verlag, January 1995



