Working with different inference algorithms

Infer.NET 2.3 Beta supports three inference algorithms. These are:
  • Expectation Propagation (EP), a generalized form of loopy belief propagation
  • Variational Message Passing (VMP), a generalized form of mean field approximation
  • Gibbs sampling (including block Gibbs sampling). This is in experimental form and currently does not work with some factors including Gates and/or array factors.

All these algorithms are powerful and general, but no one is best for all problems. Therefore Infer.NET gives you a choice of which algorithm to use. EP and VMP are deterministic in the sense that they involve no random sampling and will always give the same answer for the same model. They are also both approximate, but in different ways. Exact inference is usually impossible for practical problems, involving intractable integrals over high dimensional spaces for continuous variables, or involving prohibitively large numbers of summations for discrete variables. These algorithms work by breaking the problem down into small parts, approximating the parts, and then fusing the results together. Each algorithm minimizes a different cost function and so they may give different results. However these deterministic message-passing algorithms are very efficient and can be applied to very large scale data sets.

Gibbs sampling is a special case of the Metropolis-Hastings algorithm. From a given state of the variables it randomly moves to another state, preserving the desired equilibrium distribution. The simplest version is single-site Gibbs sampling, where each variable is visited in turn and sampled from its conditional distribution given the current state of its neighbors. Variables may be visited in any order and some variables may be sampled more than others, as long as each variable gets sampled sufficiently often.

A more efficient version of Gibbs sampling, especially in the case of deterministic constraints, is the block Gibbs sampler. Here the variables are grouped into acyclic blocks, each block is visited in turn, and the entire block is sampled from its conditional distribution given the state of its neighbors. This procedure is valid even if blocks overlap; variables in multiple blocks will simply be sampled more often.

The algorithm is a property of the InferenceEngine object and can be set as follows:

// Use Expectation Propagation
InferenceEngine ie = new InferenceEngine();
ie.Algorithm = new ExpectationPropagation();
// Use Variational Message Passing
InferenceEngine ie = new InferenceEngine();
ie.Algorithm = new VariationalMessagePassing();

Gibbs sampling also allows setting two properties:

  • BurnIn: number of samples to discard at the beginning
  • Thin: reduction factor when constructing sample and conditional lists in order to avoid correlated samples

Also, Gibbs sampling will typically require many more samples.

// Use Gibbs sampling
GibbsSampling
gs = new GibbsSampling();
gs.BurnIn = 100;
gs.Thin = 10;
InferenceEngine ie = new InferenceEngine(gs);
ie.NumberOfIterations = 2000;

Typically you will not have to specify the blocking for Gibbs sampling as default blocking is automatically based on the deterministic factors and constraints in your graphical model. However, you do have the option of explicitly specifying blocks using the Group method on the inference engine.

EP versus VMP

The main difference between these algorithms is VMP is mode seeking and will tend to avoid placing probabilty mass in regions of low probability, whereas EP will tend to average across modes and is quite happy placing probabilty mass in regions of low probability. In other words, if there are multiple solutions to your problem, e.g. multiple ways of setting parameters to fit the training data, VMP will pick one of these solutions, while EP will average them together. For some variables, this averaging effect is beneficial and for others it is not. In the future, we plan to allow individual variables to be tagged with algorithms. But in the meantime, you need to pick the one algorithm that best suits the variables in your model or divide your model into sections using shared variables.

Although both algorithms can handle a large range of factors, there are some factors which are more naturally and tractably handled by one or other of the algorithms. For example, for mixture models, which try to model multi-model posteriors, you may be better off using VMP. On the other hand, EP handles hard constraints such as clipped Gaussians. EP is the default algorithm choice for Infer.NET. The list of factors and constraints shows which factors each algorithm supports, and can help guide your algorithm selection.

Further reading on EP can be found at http://research.microsoft.com/~minka/papers/ep/roadmap.html. Further reading on VMP can be found at http://www.johnwinn.org/Research/VMP.html.

EP and VMP are in fact both part of a more general class of algorithms referred to as Power EP which will be available in a later release, and which greatly increases the class of problems for which there are tractable solutions.

Last modified at 8/3/2009 10:00 AM  by John Guiver 
©2009 Microsoft Corporation. All rights reserved.  Terms of Use | Trademarks | Privacy Statement