Tutorial 3: Learning a Gaussian

This tutorial shows how to use ranges to deal with large arrays of data, and how to visualise your model.
You can run the code in this tutorial either using the Examples Browser or by opening the Tutorials solution in Visual Studio and executing LearningAGaussian.cs and LearningAGaussianWithRanges.cs.
Thinking big

Because real-world applications involve large amounts of data, Infer.NET has been designed to work efficiently with large arrays. To exploit this capability, you need to use a VariableArray object rather than an array of Variable objects. This tutorial demonstrates the performance difference between these two options.
In this example, our data will be an array of 100 data points sampled from a Gaussian distribution. This can be achieved using the handy Rand class in the MicrosoftResearch.Infer.Maths namespace, which has methods for sampling from a variety of distributions.
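A minimal sketch of the sampling step follows. It assumes the data are drawn from a standard Gaussian (mean 0 and standard deviation 1 are illustrative choices) and that Rand.Normal takes a mean and a standard deviation. The namespace is the one named in this tutorial; other releases of the library may use a different one.

```csharp
using MicrosoftResearch.Infer.Maths; // namespace as named in this tutorial (assumed for your release)

// Place these statements inside a method such as Main.
// Draw 100 samples from a Gaussian with mean 0 and standard deviation 1 (illustrative values).
double[] data = new double[100];
for (int i = 0; i < data.Length; i++)
{
    data[i] = Rand.Normal(0, 1);
}
```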
The aim will be to estimate the mean and precision (inverse variance) of this data. To do this, we need to create random variables for each of these quantities and give them broad prior distributions:
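One way to write these priors is sketched below. The parameter values (a Gaussian with variance 100 for the mean, a Gamma with shape 1 and scale 1 for the precision) are illustrative assumptions chosen to be broad, not values prescribed by the text above.

```csharp
using MicrosoftResearch.Infer.Models; // namespace assumed to match the one used in this tutorial

// Place these statements inside a method such as Main.
// Broad Gaussian prior on the unknown mean (variance 100 is an illustrative choice).
Variable<double> mean = Variable.GaussianFromMeanAndVariance(0, 100);

// Broad Gamma prior on the unknown precision (shape 1, scale 1 are illustrative choices).
Variable<double> precision = Variable.GammaFromShapeAndScale(1, 1);
```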
Now we need to tie these to the data. The simple, but inefficient, approach is to loop across the data, observing each data element to be equal to a Gaussian random variable with mean mean and precision precision.
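The following is a sketch of how the whole unrolled program might look, not the shipped LearningAGaussian.cs. The class name, the data-generating parameters and the prior parameters are assumptions made for illustration, and the namespaces are assumed to match the one named in this tutorial.

```csharp
using System;
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Maths;
using MicrosoftResearch.Infer.Models;

public static class LearningAGaussianSketch // hypothetical name
{
    public static void Main()
    {
        // Illustrative data: 100 samples from a standard Gaussian.
        double[] data = new double[100];
        for (int i = 0; i < data.Length; i++)
        {
            data[i] = Rand.Normal(0, 1);
        }

        // Broad priors (parameter values are illustrative).
        Variable<double> mean = Variable.GaussianFromMeanAndVariance(0, 100);
        Variable<double> precision = Variable.GammaFromShapeAndScale(1, 1);

        // Unrolled model: one separate Gaussian variable per data point,
        // each observed to equal the corresponding data value.
        for (int i = 0; i < data.Length; i++)
        {
            Variable<double> x = Variable.GaussianFromMeanAndPrecision(mean, precision);
            x.ObservedValue = data[i];
        }

        // Infer and print the posterior distributions over mean and precision.
        InferenceEngine engine = new InferenceEngine();
        Console.WriteLine("mean=" + engine.Infer(mean));
        Console.WriteLine("prec=" + engine.Infer(precision));
    }
}
```

Because every pass through the loop creates a new Variable object, the model contains 100 separate variables and factors rather than one array.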
This complete program (LearningAGaussian.cs) executes and gives the correct answer. However, model compilation takes a long time, around 5 seconds. Worse, compilation time scales linearly with the size of the array, so 1000 points would take ten times longer! This is because the loop is being unrolled internally, which we need to avoid.

A speed-up

To handle the array of data efficiently, we need to tell Infer.NET that the data is in an array. We must first create a range to indicate the size of the array. A range is a set of integers from 0 to N-1 for some N, which can be created using Range(N) as follows:
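A short sketch of creating the range, assuming the data array holds 100 points as in the example above. The placeholder array stands in for the sampled data.

```csharp
using MicrosoftResearch.Infer.Models; // namespace assumed to match the one used in this tutorial

// Place these statements inside a method such as Main.
double[] data = new double[100];          // placeholder for the sampled data
Range dataRange = new Range(data.Length); // the integers 0 to data.Length-1
```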
We can now use this range to create an array of variables.
To refer to each element of the array, we index the array by the range i.e. we write x[dataRange]. We can specify that each element of the array is drawn from Gaussian(mean, precision) using the code:
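The array creation and its definition can be sketched together as below. The variable name x follows the text, while the prior parameters and the range size of 100 are illustrative assumptions.

```csharp
using MicrosoftResearch.Infer.Models; // namespace assumed to match the one used in this tutorial

// Place these statements inside a method such as Main.
Variable<double> mean = Variable.GaussianFromMeanAndVariance(0, 100); // illustrative prior
Variable<double> precision = Variable.GammaFromShapeAndScale(1, 1);   // illustrative prior

Range dataRange = new Range(100);

// One array-valued variable indexed by dataRange, instead of 100 separate variables.
VariableArray<double> x = Variable.Array<double>(dataRange);

// Each element x[i] is drawn independently from Gaussian(mean, precision).
x[dataRange] = Variable.GaussianFromMeanAndPrecision(mean, precision).ForEach(dataRange);
```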
Finally we attach the data to this array by setting its ObservedValue:
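Putting the pieces together, the range-based program might look like the sketch below. It is not the shipped LearningAGaussianWithRanges.cs: the class name, data-generating parameters and prior parameters are assumptions made for illustration.

```csharp
using System;
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Maths;
using MicrosoftResearch.Infer.Models;

public static class LearningAGaussianWithRangesSketch // hypothetical name
{
    public static void Main()
    {
        // Illustrative data: 100 samples from a standard Gaussian.
        double[] data = new double[100];
        for (int i = 0; i < data.Length; i++)
        {
            data[i] = Rand.Normal(0, 1);
        }

        // Broad priors (parameter values are illustrative).
        Variable<double> mean = Variable.GaussianFromMeanAndVariance(0, 100);
        Variable<double> precision = Variable.GammaFromShapeAndScale(1, 1);

        // Model the data as a single array indexed by a range.
        Range dataRange = new Range(data.Length);
        VariableArray<double> x = Variable.Array<double>(dataRange);
        x[dataRange] = Variable.GaussianFromMeanAndPrecision(mean, precision).ForEach(dataRange);

        // Attach the whole data array in one step.
        x.ObservedValue = data;

        // Infer and print the posterior distributions over mean and precision.
        InferenceEngine engine = new InferenceEngine();
        Console.WriteLine("mean=" + engine.Infer(mean));
        Console.WriteLine("prec=" + engine.Infer(precision));
    }
}
```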
Notice that we have built the model without using a loop, by using the ForEach() construct. In fact, by replacing the loop above, we get a new program (LearningAGaussianWithRanges.cs) which runs much faster - only ~250ms - a twenty-fold speed-up.
Compare the two versions in the Examples Browser with show timings selected to see this for yourself. The general rule is: "Use VariableArray rather than .NET arrays whenever possible."

Visualising your model

Infer.NET allows you to visualise the model in which inference is being performed, in the form of a factor graph. To see the factor graph, set the ShowFactorGraph property of the inference engine to true (or select Show: Factor Graph in the Examples Browser). The factor graph will then be displayed whenever a model is compiled. For the second model above, this gives something like:

[Figure: factor graph of the second model]

Random variables appear as circles, factors appear as filled squares, and observed variables or constants appear as bare text. The edges of the graph are labelled in grey to distinguish between different arguments of a factor. For example, the Gaussian factor has arguments named "mean" and "precision" which are labelled in grey (these names are separate from the variable names).

The actual mean and precision variables are given the odd names "v1" and "v3" since no names were provided in the code. You can override these names by using Named("newname"), for example:
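A sketch of naming the variables and switching on the factor graph display. The chosen names, the prior parameters and the range size are illustrative assumptions.

```csharp
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Models; // namespaces assumed to match the one used in this tutorial

// Place these statements inside a method such as Main.
// Named(...) sets the name shown in the factor graph, generated code and error messages.
Variable<double> mean = Variable.GaussianFromMeanAndVariance(0, 100).Named("mean");
Variable<double> precision = Variable.GammaFromShapeAndScale(1, 1).Named("precision");

Range dataRange = new Range(100);
VariableArray<double> x = Variable.Array<double>(dataRange).Named("data");
x[dataRange] = Variable.GaussianFromMeanAndPrecision(mean, precision).ForEach(dataRange);

// Ask the engine to display the factor graph whenever a model is compiled.
InferenceEngine engine = new InferenceEngine();
engine.ShowFactorGraph = true;
```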
Below is the factor graph when mean, precision and data have been named.

[Figure: factor graph with named variables]

Naming variables gives them the correct name in generated code, in error messages and in the transform browser, as well as in factor graphs. Names must be unique or you will get an exception when you try to run inference.

In the Examples Browser, try viewing the factor graph for the unrolled example above (LearningAGaussian.cs). You will see what effect unrolling has!


