Gaussian Process classifier
This page describes an experimental feature that is likely to change in future releases.
This example provides an introduction to Gaussian Process modelling in Infer.NET. You can run the code using the Examples Browser. The goal is to build a non-linear Bayes point machine classifier by using a Gaussian Process to define the scoring function. To set up the problem, suppose we have the following data:
Each element of xdata is a vector of input values, and the corresponding element of ydata is the desired output (i.e. label or class) for that vector. As in the linear Bayes point machine, we will first map the input into a real-valued score, then threshold the score to determine the output. The only difference is that the score will be an arbitrary non-linear function of the input vector.
A Gaussian Process is a distribution over functions. In Infer.NET, a function from Vector to double is denoted by the type IFunction. Therefore a random function has type Variable<IFunction>. Such variables can be given a Gaussian Process prior, and when you infer the variable, you get a Gaussian Process posterior. Infer.NET implements an efficient type of Gaussian Process called a sparse Gaussian Process that allows you to control the cost of inference by specifying a basis on which the function will be represented. For the moment, we will skip the details of defining a sparse Gaussian Process and focus on creating and using random functions. Here is the code to create a random function:
To use a random function, you provide an input vector and get a random output value. This is done with Variable.FunctionEvaluate(f,x). In our case, we want to evaluate the function f at the locations provided by inputs, and threshold the output values:
Note that we have added some Gaussian noise to the score before thresholding it, to allow some noise in the labels. The Gaussian Process classification model can easily be changed into a Gaussian Process regression model or other likelihood model simply by changing the line of code that relates the data y to the score.
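The effect of swapping the likelihood can be sketched outside Infer.NET. The following plain-Python illustration is not the Infer.NET API; the function names and the noise variance of 0.1 are assumptions made for this sketch. It shows that classification and regression differ only in how the noisy score is related to the observed value.

```python
import math
import random

NOISE_VARIANCE = 0.1  # assumed value, for illustration only

def noisy_score(score, rng):
    """Add zero-mean Gaussian noise to a real-valued score."""
    return score + rng.gauss(0.0, math.sqrt(NOISE_VARIANCE))

def classification_output(score, rng):
    """Classifier likelihood: threshold the noisy score at zero."""
    return noisy_score(score, rng) > 0

def regression_output(score, rng):
    """Regression likelihood: observe the noisy score directly."""
    return noisy_score(score, rng)

rng = random.Random(0)
labels = [classification_output(2.0, rng) for _ in range(100)]
targets = [regression_output(2.0, rng) for _ in range(100)]
```

With a score far from zero, almost every sampled label agrees with the sign of the score, while the regression targets scatter around the score itself.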
Gaussian Process distributions
A Gaussian Process distribution is defined by a mean function and a covariance function. The mean function maps Vector to double so it has type IFunction. The covariance function maps two Vectors to double so it has type IKernelFunction. For example, the following creates a Gaussian Process with zero mean function and squared exponential covariance function (length scale = exp(0) = 1):
Infer.NET provides a small set of commonly used mean and covariance functions (see the Kernels namespace), and it is easy to define your own: create a class that implements IFunction or IKernelFunction.
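To show what a mean function and a covariance function compute, here is a minimal plain-Python sketch. It is not Infer.NET code. The log-scale parameter mirrors the "length scale = exp(0) = 1" convention above, while the signal standard deviation parameter and the exact formula are assumptions based on the standard textbook form of the squared exponential kernel.

```python
import math

def zero_mean(x):
    """Mean function: maps an input vector to 0 everywhere."""
    return 0.0

def squared_exponential(x1, x2, log_length=0.0, log_signal_sd=0.0):
    """k(x1, x2) = s^2 * exp(-||x1 - x2||^2 / (2 * l^2)).

    l = exp(log_length) and s = exp(log_signal_sd); both default to 1.
    """
    length = math.exp(log_length)
    signal_var = math.exp(2.0 * log_signal_sd)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return signal_var * math.exp(-0.5 * sq_dist / length ** 2)

k_same = squared_exponential([0.0, 0.0], [0.0, 0.0])  # identical inputs
k_near = squared_exponential([0.0, 0.0], [1.0, 0.0])  # one length scale apart
k_far = squared_exponential([0.0, 0.0], [3.0, 0.0])   # three length scales apart
```

The covariance is largest for identical inputs and decays as the inputs move apart, which is what makes nearby inputs receive similar scores.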
To get a sparse Gaussian Process, we pair a GaussianProcess with a set of basis vectors. The basis vectors are intended to summarize the set of inputs into a smaller set. By changing the size of the basis, you control the cost of the inference. (For details, see the references at the end.) If the basis set is exactly the set of inputs, then the distribution is equivalent to a full (non-sparse) Gaussian Process. A good strategy for computing the basis is to cluster the input vectors. Another approach is to use a random subset of the input vectors. Here for simplicity we will set them by hand to roughly partition the range of the inputs:
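Separately from the hand-set basis used in this example, the two automatic strategies mentioned above (a random subset and clustering) can be sketched in plain Python. This is an illustration only: the helper names, the toy input points, and the use of a basic k-means loop as the clustering method are all assumptions, not part of Infer.NET.

```python
import random

def random_subset_basis(inputs, size, seed=0):
    """Pick `size` distinct input vectors at random to serve as the basis."""
    rng = random.Random(seed)
    return rng.sample(inputs, min(size, len(inputs)))

def kmeans_basis(inputs, size, iterations=10, seed=0):
    """Use the centers of a basic k-means clustering as the basis."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(inputs, size)]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for x in inputs:
            nearest = min(
                range(size),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centers[i])),
            )
            clusters[nearest].append(x)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers

# Hypothetical 2-D inputs, chosen only to exercise the helpers.
toy_inputs = [[0.0, 0.0], [0.0, 0.2], [0.2, 0.0], [1.0, 1.0], [1.0, 1.2], [1.2, 1.0]]
subset_basis = random_subset_basis(toy_inputs, 2)
cluster_basis = kmeans_basis(toy_inputs, 2)
```

Either list could then play the role of the basis vectors; a smaller basis lowers the cost of inference at the price of a coarser summary of the inputs.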
Now we have all the pieces in place to infer the random function and make predictions:
Note that we could have built a model for making predictions (as in the Bayes Point Machine tutorial) but here for simplicity we call the Marginal method on the SparseGP posterior to get the distribution of the score at a particular input. (It is also possible to get the joint distribution of the scores at multiple inputs.)
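Once the marginal distribution of the score at a test input is available as a Gaussian, it can be turned into a class probability. The plain-Python sketch below assumes the marginal is summarized by a mean and a variance and that the label noise variance is 0.1; the function name and all numeric values are hypothetical and not taken from the example.

```python
import math

def predictive_prob_true(mean, variance, noise_variance=0.1):
    """P(score + noise > 0) when score ~ N(mean, variance).

    The score uncertainty and the label noise are independent Gaussians,
    so their variances add before applying the standard normal CDF.
    """
    total_sd = math.sqrt(variance + noise_variance)
    return 0.5 * (1.0 + math.erf(mean / (total_sd * math.sqrt(2.0))))

# Hypothetical marginals: same mean, different amounts of uncertainty.
p_confident = predictive_prob_true(mean=1.0, variance=0.1)
p_uncertain = predictive_prob_true(mean=1.0, variance=2.0)
p_boundary = predictive_prob_true(mean=0.0, variance=0.5)
```

A larger posterior variance pulls the predicted probability toward 0.5, so inputs far from the training data are classified with less confidence.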
Selecting the covariance function
An important issue in Gaussian Process modelling is choosing the appropriate covariance function (both its type and its parameters such as length scales). One approach is to treat each possible covariance function as a separate model and apply Bayesian model selection. In Infer.NET, it is straightforward to score a model as discussed in Computing model evidence for model selection. For the Gaussian Process classifier, we just wrap all of the model code in an evidence block:
Now we can score different covariance functions against our data by setting prior.ObservedValue to various SparseGP priors. The example does this for 3 different possibilities, giving the following results:
Notice that the model is only compiled once, while inference is repeated three times (once for each prior). In this case, the neural net covariance function provides the best fit, and classifies all of the training data correctly.
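The comparison step of Bayesian model selection can be sketched in plain Python. Given a log evidence for each candidate covariance function, and assuming a uniform prior over the candidates, the posterior probability of each model is its normalized evidence. The kernel names and log-evidence values below are hypothetical placeholders, not the results produced by the example.

```python
import math

def model_posteriors(log_evidences):
    """Posterior model probabilities under a uniform prior over models."""
    names = list(log_evidences)
    max_log = max(log_evidences.values())
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(log_evidences[n] - max_log) for n in names]
    total = sum(weights)
    return {n: w / total for n, w in zip(names, weights)}

# Hypothetical log-evidence values, NOT the results reported by the example.
log_evidences = {"kernel_a": -4.2, "kernel_b": -3.1, "kernel_c": -5.0}
posteriors = model_posteriors(log_evidences)
best = max(posteriors, key=posteriors.get)
```

The model with the highest log evidence receives the highest posterior probability, and the size of the gaps between log evidences indicates how decisive the preference is.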
References for sparse Gaussian Processes
L. Csato, M. Opper. "Sparse representation for Gaussian process models." In Advances in Neural Information Processing Systems 13. MIT Press, pp. 444-450, 2000.
Yuan (Alan) Qi, Ahmed H. Abdel-Gawad, and Thomas P. Minka. "Sparse-posterior Gaussian Processes for general likelihoods." In Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence, 2010.