Running inference

Inference is concerned with the calculation of posterior probabilities based on one or more data observations. The word 'posterior' is used to indicate that the calculation occurs after the evidence of the data is taken into account; 'prior' probabilities refer to any initial uncertainty we have. In the section on the Infer.NET modelling API, a simple model was introduced to learn a Gaussian of unknown mean and precision from data. The first two lines of the program showed our initial uncertainty in the mean and precision of our simple Gaussian model (i.e. the prior). Then we introduced some data. Now we would like to infer the posterior probabilities for the mean and precision. The program below shows how to do this:

// The model defined using the modelling API
Variable<double> mean = Variable.GaussianFromMeanAndVariance(0, 100);
Variable<double> precision = Variable.GammaFromShapeAndScale(1, 1);
VariableArray<double> data = Variable.Observed(new double[] { 11, 5, 8, 9 });
Range i = data.Range;
data[i] = Variable.GaussianFromMeanAndPrecision(mean, precision).ForEach(i);

// Create an inference engine for VMP
InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());
// Retrieve the posterior distributions
Gaussian marginalMean = engine.Infer<Gaussian>(mean);
Gamma marginalPrecision = engine.Infer<Gamma>(precision);
Console.WriteLine("mean=" + marginalMean);
Console.WriteLine("prec=" + marginalPrecision);

If you run this, you will get output as follows:

mean=Gaussian(8.165, 1.026)
prec=Gamma(3, 0.08038)

The Gaussian distribution shows the mean and variance of the mean marginal, and the Gamma distribution shows the shape and scale of the precision marginal. The pattern in this simple program is common to many inference problems. We start with a model whose structure is defined by some unknown parameters (treated as variables), their prior uncertainties, and the relationships between them. We then provide some observed data. Finally, we perform inference and retrieve the posterior marginal distributions of the variables we are interested in.

The term 'marginal' is common in statistics and refers to the operation of 'summing out' (in the case of discrete variables) or 'integrating out' (in the case of continuous variables) the uncertainty associated with all other variables in the problem. The inference methods in Infer.NET make heavy use of fully factorised approximations, i.e. approximations which are a product of functions of individual variables. Although the true posteriors may be joint distributions over the participating variables, the approximate posteriors in such models are naturally in the form of single-variable marginal distributions. In the example, the marginal posteriors marginalMean and marginalPrecision are retrieved. Given the evidence of the data, these distributions are much tighter than the prior distributions for these variables: for instance, the variance of the mean falls from 100 under the prior to about 1.026 under the posterior.
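
For the Gaussian example, the ideas in this paragraph can be written out as follows. The symbols are notation introduced here for illustration only: \mu for the mean, \tau for the precision, D for the observed data, and q for the approximate posterior factors. This is a sketch of the general form, not a derivation of the algorithm's update equations.

```latex
% Joint posterior over the mean \mu and precision \tau given data D = \{x_1, \dots, x_N\}
p(\mu, \tau \mid D) \;\propto\; p(\mu)\, p(\tau) \prod_{i=1}^{N} \mathcal{N}\!\left(x_i \mid \mu, \tau^{-1}\right)

% Exact marginal posterior of the mean: integrate out the precision
p(\mu \mid D) \;=\; \int p(\mu, \tau \mid D)\, d\tau

% Fully factorised approximation: a product of single-variable factors
p(\mu, \tau \mid D) \;\approx\; q(\mu)\, q(\tau)
```

In the program above, q(\mu) corresponds to the Gaussian returned as marginalMean and q(\tau) to the Gamma returned as marginalPrecision.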

Creating an inference engine

All inferences in Infer.NET are achieved through the use of an InferenceEngine object. When you create this object, you can specify the inference algorithm you wish to use. For example, to create an inference engine that uses Variational Message Passing, you write:

InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());

In the example above, we explicitly set the algorithm to Variational Message Passing (VMP). Other available algorithms are Expectation Propagation (EP) and Gibbs sampling. Between them, these three algorithms can solve a wide range of inference problems; they are discussed further in the section on working with different inference algorithms.

As well as the inference algorithm, the engine has a number of other settings which you can modify to affect how inference is performed and what is displayed during the inference process.
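
As an illustration of such settings, here is a minimal sketch. The two properties used, NumberOfIterations and ShowProgress, are not named in the text above and are shown only as examples. The namespaces are assumed to be those of the open-source Microsoft.ML.Probabilistic package; older releases used MicrosoftResearch.Infer instead.

```csharp
// Assumption: namespaces of the open-source Microsoft.ML.Probabilistic package.
using System;
using Microsoft.ML.Probabilistic.Algorithms;
using Microsoft.ML.Probabilistic.Models;

// Create an engine that uses Variational Message Passing
InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());

// Affects how inference is performed: number of iterations of the algorithm
engine.NumberOfIterations = 50;

// Affects what is displayed: suppress progress output during inference
engine.ShowProgress = false;

Console.WriteLine("Iterations: " + engine.NumberOfIterations);
```

Settings like these are applied to the engine before the first call to Infer().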

Performing inference

Having created an inference engine, you can use it to infer marginal distributions for the variables you are interested in, using the Infer<TReturn>() method. When using this method, you pass in the type of distribution you want returned. For example, this code asks for the posterior distribution over a variable x as a Gaussian distribution:

Gaussian xPosterior = engine.Infer<Gaussian>(x);

If you are unsure of what type to pass into Infer<TReturn>() then you can instead call Infer() with no type argument to return the default distribution type. However, once you have done this you should switch back to the typed version, since the default return type can vary with the choice of inference algorithm and may change with different versions of the framework.
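
The workflow described above might look like the following sketch, which reuses the Gaussian model from the first example: call the untyped Infer() once to discover the default distribution type, then use the typed version. The namespaces are assumed to be those of the open-source Microsoft.ML.Probabilistic package (older releases used MicrosoftResearch.Infer), and the variable name untyped is illustrative.

```csharp
// Assumption: namespaces of the open-source Microsoft.ML.Probabilistic package.
using System;
using Microsoft.ML.Probabilistic.Algorithms;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Models;

// Same model as in the first example
Variable<double> mean = Variable.GaussianFromMeanAndVariance(0, 100);
Variable<double> precision = Variable.GammaFromShapeAndScale(1, 1);
VariableArray<double> data = Variable.Observed(new double[] { 11, 5, 8, 9 });
data[data.Range] = Variable.GaussianFromMeanAndPrecision(mean, precision).ForEach(data.Range);

InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());

// Untyped call: returns the default distribution type as an object
object untyped = engine.Infer(mean);
Console.WriteLine("Default return type: " + untyped.GetType().Name);

// Once the type is known, switch to the typed version
Gaussian marginalMean = engine.Infer<Gaussian>(mean);
Console.WriteLine("mean=" + marginalMean);
```

Printing the runtime type is only a discovery step; production code would keep just the typed call.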

Important note:
When you call Infer(), the inference engine may not actually execute an inference algorithm if it has already computed the marginal that you are requesting. For instance, in the example above, when Infer<Gaussian>(mean) is called, the inference engine will compute both the posterior over the mean and the posterior over the precision (since this requires no additional computation). Then, when Infer<Gamma>(precision) is called, the cached posterior over the precision is returned and no additional computation is performed.

Which marginals the inference engine calculates in any given call to Infer() is complex and depends on the structure of your model. You can instead be explicit about exactly which marginals are to be calculated by passing the set of variables to the InferAll() method, for example:

engine.InferAll(a,b,x,y,z);

Calling InferAll() always causes inference to run straight away for exactly the variables listed. InferAll() does not return a value but instead caches the computed marginals. To retrieve these marginals, you call Infer() as normal and it is guaranteed to return straight away with the cached marginal value. Hence, you can make explicit when inference happens by prefixing all calls to Infer() with a single call to InferAll() listing the set of marginals to be computed.

Reasons for using InferAll():

  • You want to remove the (normally small) overhead of calculating additional marginals that will not be used.
  • You want to control when inference happens, e.g. to ensure a user interface remains responsive.
  • You want to ensure that inference happens in one go, and not in a number of separate runs of the inference algorithm.
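
Applied to the Gaussian model from the first example, the InferAll() pattern might look like the sketch below. The namespaces are assumed to be those of the open-source Microsoft.ML.Probabilistic package; older releases used MicrosoftResearch.Infer.

```csharp
// Assumption: namespaces of the open-source Microsoft.ML.Probabilistic package.
using System;
using Microsoft.ML.Probabilistic.Algorithms;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Models;

// Same model as in the first example
Variable<double> mean = Variable.GaussianFromMeanAndVariance(0, 100);
Variable<double> precision = Variable.GammaFromShapeAndScale(1, 1);
VariableArray<double> data = Variable.Observed(new double[] { 11, 5, 8, 9 });
data[data.Range] = Variable.GaussianFromMeanAndPrecision(mean, precision).ForEach(data.Range);

InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());

// Run inference now, for exactly these two variables; nothing is returned
engine.InferAll(mean, precision);

// These calls retrieve the cached marginals computed by InferAll()
Gaussian marginalMean = engine.Infer<Gaussian>(mean);
Gamma marginalPrecision = engine.Infer<Gamma>(precision);
Console.WriteLine("mean=" + marginalMean);
Console.WriteLine("prec=" + marginalPrecision);
```

Placing the single InferAll() call at a point of your choosing (for example, on a background thread) is what makes the timing of inference explicit.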

Alternative inference queries

So far, we have only considered how to use an inference engine to retrieve marginal distributions. However, some inference algorithms can perform other types of inference query to return other quantities (such as lists of samples from the marginal). Currently, only the Gibbs sampling algorithm supports alternative query types. To perform alternative inference queries, you pass a second argument to Infer() which specifies a QueryType object. The set of built-in query types is provided for convenience on the static class QueryTypes. So, to retrieve a set of posterior samples for a variable x, you can write:

SampleList<double> postSamples = engine.Infer<SampleList<double>>(x, QueryTypes.Samples);

Last modified at 8/4/2009 11:17 AM  by John Winn 
©2009 Microsoft Corporation. All rights reserved.