Infer.NET user guide : Tutorials and examples : Multi-class classification

Sparse Multi-class Bayes Point Machine

Here, we present the details for implementing a sparse multiclass Bayes point machine using Infer.NET. Sparse, in this context, represents the fact every data point has only a subset of relevant features. Hence, a data point is represented by a set of indices (into the feature set) and the corresponding values for these features.

During the training phase, we have labeled data, and the goal is to find the (distribution over) weight vectors for each class. During the testing phase, given unlabeled data, the goal is to find the posterior distribution over the  class labels for the data. Hence, we specify two models, a trainModel and a testModel that can be used during training and testing, respectively.

Model for training

During training, all examples are labeled. So, we group training examples by their label. For a particular class, we have the the number of items nItem, a range over these items, the number of features in each item (represented using a VariableArray), the range over these features, and of course, the values and the indices of the features. This means that the values and features need to be represented using a three-deep jagged array indexed by class, item, and feature, where each outer index is over a range that is dependent on the inner index. The syntax is as follows:

Range c = newRange(nClass).Named("c");
var nItems = Variable.Array<int>(c).Named("nItems");
Range item = newRange(nItems[c]).Named("item");
var xValueCount = Variable.Array(Variable.Array<int>(item),c).Named("xCount");
Range itemFeature = new Range(xValueCount[c][item]).Named("itemFeature");
var xValues = Variable.Array(Variable.Array(Variable.Array<double>(itemFeature),item),c).Named("xValues");
var xIndices = Variable.Array(Variable.Array(Variable.Array<int>(itemFeature),item),c).Named("xIndices");

Weights corresponding to each feature have an independent Gaussian prior that is set during the start of training phase. We require this independence assumption since we want to work with only a subset of these features for each training example.

var w = Variable.Array(Variable.Array<double>(feature),c).Named("w");
var wInit = Variable.Array(Variable.Array<Gaussian>(feature),c).Named("wInit");
w[c][feature] = Variable<double>.Random(wInit[c][feature]);

A data point belonging to a particular class, k, should have maximal score under that class. To ascertain this, we constrain that the examples have  maximal score only under the class it is being labeled to belong. In the snippet below, BPMUtils.ComputeClassScores, computes the scores of the example under all classes. BPMUtils.ConstrainArgMax constrains that this score is maximum for the class the example belongs to. 

using(var cBlock = Variable.ForEach(c))
  var currC = cBlock.Index;
    var score = BPMUtils.ComputeClassScores(w, xValues[c][item], xIndices[c][item], itemFeature, NoisePrec);
    BPMUtils.ConstrainArgMax(currC, score);

Computing class scores for sparse model

Since, only relatively small number of features are pertinent  for any given example, score can be efficiently computed. We can obtain the weights, wSparse, for pertinent features of an item using Variable.SubArray. (Variable.SubArray is efficient when there are unique indices to retrieve. For non-unique indices, use Variable.GetItems). We compute the score by summing (using Variable.Sum) the element-wise product between wSparse and observed feature values.

var score = Variable.Array<double>(c);
var scorePlusNoise = Variable.Array<double>(c);
var wSparse = Variable.Array(Variable.Array<double>(itemFeature), c);
var product = Variable.Array(Variable.Array<double>(itemFeature), c);
wSparse[c]  =Variable.Subarray<double>(w[c],xIndices);
product[c][itemFeature] = xValues[itemFeature] * wSparse[c][itemFeature];
score[c] = Variable.Sum(product[c]);
scorePlusNoise[c] = Variable.GaussianFromMeanAndPrecision(score[c], noisePrec);

Training the model

The above model can be nicely encapsulated as trainModel, which can then be used to train, either in batch mode, or incrementally. h mode, orTraining the model in batch mode

We set the prior over the weights to be standard Gaussians. Then, use the observed data to set the observed values for the variables. Then, inference is performed to obtain the posterior over w. We also reset the prior to the inferred posterior distribution so that we can perform incremental training, and use at the testing time.

var wInferred = (<Gaussian[][]>(trainModel.w);
trainModel.wInit.ObservedValue = wInferred;
return wInferred;

 Training the model in incremental mode

In this case, we use the posterior distribution inferred during the previous learning stage as the prior distribution for the w. Perform inference to obtain updated posterior distribution over w.erior distribution over w.

Using Shared variables

We can easily convert the sparse Bayes Point Machine to make use of shared variables. See Multi-class Bayes Point machine using shared variables for an example.

©2009-2013 Microsoft Corporation. All rights reserved.  Terms of Use | Trademarks | Privacy Statement