## Running inference

Inference is concerned with the calculation of posterior probabilities based
on one or more data observations. The word 'posterior' is used to indicate that
the calculation occurs **after **the evidence of the data is taken
into account; 'prior' probabilities refer to any initial uncertainty we have. In
the section on
the Infer.NET modelling API, a simple
model was introduced to learn a Gaussian of unknown mean and precision from
data. The first two lines of the program showed our initial uncertainty in the
**mean **and **precision **of our simple Gaussian
model (i.e. the prior). Then we introduced some data. Now we would like to infer
the marginal (see later) posterior probabilities for the **mean **
and **precision**. The program below shows how to do this:

```
// The model defined using
the modelling API
``` |

If you run this, you will get output as follows:

```
mean=Gaussian(8.165, 1.026)
``` |

The Gaussian distribution shows the mean and variance of the mean marginal, and the Gamma distribution shows the shape and scale of the precision marginal. The pattern inherent in the this simple program is common to many inference problems. We start with a model which has some structure defined by some unknown parameters (considered as variables), their prior uncertainties, and their relationships. We then provide some observed data. Finally we perform inference and retrieve the posterior marginal distributions of the variables we are interested in.

The term 'marginal' is a common term in statistics which refers to the
operation of 'summing out' (in the case of discrete variables) or 'integrating
out' (in the case of continuous variables) the uncertainty associated with all
other variables in the problem. The inference methods in Infer.NET make heavy
use of fully factorised approximations; i.e. approximations which are a product
of functions of individual variables. Although the true posteriors may be joint
distributions over the participating variables, the approximate posteriors in
such models are naturally in the form of single variable marginal distributions.
In the example the marginal posteriors, **marginalMean **and
**marginalPrecision**, are retrieved. Based on the evidence of the data,
these distributions are seen to be much tighter than the prior distributions for
these variables. For further background reading on inference, marginal
distributions and fully factorised approximations, plesae refer to
Resources and References.

### Creating an inference engine

All inferences in Infer.NET are achieved through the use of an **InferenceEngine**
object. When you create this object, you can specify the inference
algorithm you wish to use. For example, to create an inference engine that
uses Variational Message Passing, you write:

```
InferenceEngine engine =
new
InferenceEngine(new
VariationalMessagePassing());
``` |

In this example above, we explicitly set the algorithm to Variational Message Passing (VMP). Other available algorithms are Expectation Propagation (EP) and Gibbs sampling. Between them, these three algorithms can solve a wide range of inference problems; they are discussed further in working with different inference algorithms.

As well the inference algorithm, the engine has a number of other settings which you can modify to affect how inference is performed and what is displayed during the inference process.

### Performing inference

Having created an inference engine, you can use it to infer marginal
distributions for variables you are interested in, using the **Infer<TReturn>()**
method. When using this method, you should pass in the type of the
distribution you want to be returned. For example, this code asks for the
posterior distribution over a variable **x** as a Gaussian
distribution.

```
Gaussian xPosterior
= engine.Infer<Gaussian>(x);
``` |

For array variables, you can specify an array type over distributions. Here are some examples, the first one showing inference on a 1-D random variable array where each element of the array has a Gaussian distribution, the second showing inference on a 1-D array of 2-D arrays of random variables where each element has a Gamma distribution.

```
Gaussian[]
x1dArrayPosterior = engine.Infer<Gaussian[]>(x1dArray);
``` |

Note that this syntax is for the usual situation where the random variables have been defined with Variable.Array as described in the sections on variable arrays and jagged variable arrays. If you use .NET arrays for your variables, then you must call infer for each element of the array.

##### Inferring constant and observed variables

Besides random variables, you can call **Infer** on a constant
variable or observed variable. For example, in the above Gaussian model
you can Infer **data** as a Gaussian array:

```
Gaussian[]
dataPosterior = engine.Infer<Gaussian[]>(data);
``` |

The result here will be an array of Gaussians, each a point mass on the
observed value. This is useful when you want to experiment with making
some variables observed vs. random. For example, if you change the
definition of **data** to be Variable.Array<double>(new Range(4))
instead of Variable.Observed then the above call to Infer still
works, and the result is an array of Gaussians which are not point masses.
To minimize the overhead of this feature, there is a restriction on which
observed variables can be inferred. Specifically, you can only infer an
observed variable that has a definition other than its observation. In
other words, Infer.NET must be able to determine what distribution the variable
would have if it had not been observed. In the model above, **data
**satisfies this condition because of the line **data[i] = ...**
If this line was not present, then you would not be able to infer the
distribution of **data**.

##### When does inference happen for a particular variable?

When you call Infer() the inference engine __may not actually execute an
inference algorithm__, if it has already computed the marginal that you are
requesting. For instance, in the example above, when **
Infer<Gaussian>(mean)** is called, the inference engine will compute both
the posterior over the mean *and* the posterior over the precision (since
this requires no additional computation). Then when **
Infer<Gamma>(precision)** is called, this cached precision is returned
and no additional computation is performed.

Which marginals the inference engine calculates in any given call to **
Infer()** is complex and depends on the structure of your model. You can
instead choose to set this explicitly using the **OptimiseForVariables **
property on the inference engine, specifying the set of variables whose
marginals should be calculated. For example:

```
engine.OptimiseForVariables =
new[] {x,y,z};
``` |

After setting this property, the next call to **Infer() **with
one of the listed variables __always causes inference to run straight away__
for all the variables listed (and only those variables). To retrieve
marginals for the other variables in the list, you can then call **Infer()**
as normal and it is guaranteed to return straight away with the cached marginal
value. If you call **Infer()** for a variable not in the
list, you will get an error. To revert to normal inference engine
behaviour, just set **OptimiseForVariables** to null.

Reasons for using **OptimiseForVariables**:

- You want to remove the (normally small) overhead of calculating additional marginals that will not be used.
- You want to control when inference happens e.g. to ensure a user interface remains responsive
- You want to ensure that inference happens in one go, and not in a number of separate runs of the inference algorithm.

### Alternative inference queries

So far, we have only considered how to use an inference engine to retrieve
marginal distributions. However, some inference algorithms can perform
other types of inference query to return other quantities (such as lists of
samples from the marginal). To perform alternative inference queries,
you pass a second argument to **Infer() **which specifies a **
QueryType** object. The set of built-in query types at provided for
convenience on the static class **QueryTypes**. When using
GibbsSampling, you can retrieve a
set of posterior samples for variable **x **via:

```
IList<double>
postSamples = engine.Infer<IList<double>>(x,
QueryTypes.Samples);
``` |

When using QueryTypes, it is a good idea to tell the compiler in advance what QueryTypes you want to infer. Otherwise, inference may re-run on each query. To specify the QueryTypes in advance, attach them as attributes to the variable, like so:

```
x.AddAttribute(QueryTypes.Marginal);
``` |

Notice that in this case, you need to explicitly specify whether you want to infer Marginals.