Multi-class Bayes Point Machine
Here, we present the details for implementing a multiclass Bayes point machine using Infer.NET. In addition, we describe the method for training and testing the model.
During the training phase, we use the labeled data to infer a distribution over the weight vector of each class. During the testing phase, the goal is to find the posterior distribution over the class labels of unlabeled data. Hence, we specify two models, a trainModel and a testModel, used during training and testing respectively. The distribution over the weights is shared between them by setting the prior over the weights in the testModel to the posterior over the weights inferred by the trainModel.
Model for training
During training, all examples are labeled, so we partition the training examples according to their assigned label. For each class, we define the data variables: the number of items in that class, a range over these items, and the feature vectors (represented using Vector) of the items in that class.
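The partitioning step can be illustrated outside Infer.NET. The following is a minimal Python sketch, not the Infer.NET model code; the function name partition_by_label and the toy data are assumptions made for illustration. It produces, for each class, the item count and the list of feature vectors that the per-class data variables would be observed to.

```python
def partition_by_label(features, labels, n_classes):
    """Group feature vectors by class label (hypothetical helper).

    Returns (n_items, x_values), where n_items[c] is the number of
    examples labeled c and x_values[c] is the list of their feature vectors.
    """
    x_values = [[] for _ in range(n_classes)]
    for x, y in zip(features, labels):
        x_values[y].append(x)
    n_items = [len(xs) for xs in x_values]
    return n_items, x_values


# Toy data: four 2-dimensional examples spread over three classes.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]
labels = [0, 1, 0, 2]

n_items, x_values = partition_by_label(features, labels, n_classes=3)
print(n_items)
print(x_values[0])
```

In the Infer.NET model, the count for each class would size that class's range, and the corresponding list would be supplied as the observed value of that class's array of feature vectors.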
The weight vector of each class has a VectorGaussian prior. The prior distribution is set at the start of the training phase.
A data point belonging to a particular class, k, should have its maximal score under that class, where the score is defined as the inner product between the example and the class weight vector, plus some noise. To enforce this, we constrain each example to have its maximal score under the class with which it is labeled. BPMUtils.ComputeClassScores computes the scores of the example under all classes, and BPMUtils.ConstrainArgMax constrains the score to be maximal for the class the example belongs to.
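To make the score definition and the arg-max condition concrete, here is a small Python sketch with fixed numbers. The functions compute_class_scores and satisfies_arg_max are hypothetical stand-ins named after the idea, not the BPMUtils helpers themselves. In the model the constraint acts on random variables during inference, whereas this sketch only checks the condition for one fixed set of weights.

```python
import random


def compute_class_scores(weights, x, noise_std=0.0, rng=random):
    """Score of x under each class: inner product w_c . x, plus optional Gaussian noise."""
    scores = []
    for w in weights:
        score = sum(wi * xi for wi, xi in zip(w, x))
        if noise_std > 0.0:
            score += rng.gauss(0.0, noise_std)
        scores.append(score)
    return scores


def satisfies_arg_max(scores, label):
    """True if the labeled class scores strictly higher than every other class."""
    return all(scores[label] > s for c, s in enumerate(scores) if c != label)


# One weight vector per class (three classes, two features).
weights = [[0.0, 0.0], [1.0, -1.0], [-1.0, 2.0]]
x = [2.0, 0.5]

scores = compute_class_scores(weights, x)  # noise-free for reproducibility
print(scores)
print(satisfies_arg_max(scores, 1))  # is class 1 the arg max?
```

With these weights the example scores highest under class 1, so a label of 1 is consistent with the weights, while a label of 0 or 2 would violate the constraint.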
Training the model
The above model can be encapsulated as trainModel, which can then be trained either in batch mode or incrementally.
Training in batch mode
In this case, we first initialize trainModel.wInit to a standard VectorGaussian for every class except the first, which we set to a VectorGaussian point mass in order to remove an unnecessary degree of freedom.
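One way to see why a degree of freedom is redundant: adding the same vector to every class's weight vector changes all scores for a given example by the same amount, so the class with the maximal score is unchanged. The Python sketch below checks this on toy numbers, assuming noise-free scores and assuming the first class is pinned at the zero vector; predict and shift_all are hypothetical helper names.

```python
def predict(weights, x):
    """Index of the class with the highest noise-free score w_c . x."""
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for w in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def shift_all(weights, offset):
    """Add the same offset vector to every class's weight vector."""
    return [[wi + oi for wi, oi in zip(w, offset)] for w in weights]


weights = [[0.5, -1.0], [1.0, 2.0], [-2.0, 0.5]]
inputs = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0], [2.0, -3.0]]

# Shifting every vector by -weights[0] pins class 0 at the zero vector.
pinned = shift_all(weights, [-wi for wi in weights[0]])

original_predictions = [predict(weights, x) for x in inputs]
pinned_predictions = [predict(pinned, x) for x in inputs]
print(pinned[0])
print(original_predictions == pinned_predictions)
```

Because the predictions agree, fixing one class's weights loses no expressive power, which is the motivation for the point-mass prior on the first class.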
Then, we set all the observed values and perform inference to obtain the posterior over w. After inference, we reset the prior to the inferred posterior distribution, so that it can be used both for incremental training and during the testing phase.
Training in incremental mode
In this case, we use the posterior distribution inferred during the previous learning stage as the prior distribution for w, and perform inference to obtain an updated posterior distribution over w.
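The idea of using a posterior as the next prior can be sketched with a much simpler model than the Bayes point machine. The Python example below is a toy stand-in, not Infer.NET code: it uses the exact conjugate update for the mean of a one-dimensional Gaussian with known noise precision, and the function name update is an assumption. In this toy model, two incremental stages give the same posterior as one batch stage over all the data. With the approximate inference used for the Bayes point machine, the incremental and batch results need not coincide exactly.

```python
def update(prior, data, noise_precision=1.0):
    """Exact posterior (mean, precision) for a Gaussian mean with known noise precision."""
    mean, precision = prior
    post_precision = precision + noise_precision * len(data)
    post_mean = (precision * mean + noise_precision * sum(data)) / post_precision
    return post_mean, post_precision


prior = (0.0, 1.0)  # (mean, precision) of the initial prior
chunk1, chunk2 = [1.0, 2.0], [3.0, 2.0]

# Batch mode: one inference pass over all the data.
batch_posterior = update(prior, chunk1 + chunk2)

# Incremental mode: the posterior from stage 1 becomes the prior for stage 2.
stage1_posterior = update(prior, chunk1)
incremental_posterior = update(stage1_posterior, chunk2)

print(batch_posterior)
print(incremental_posterior)
```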
Next, we see how to specify the model for testing and use it to perform prediction.