Semi-Supervised Learning with Conditional Harmonic Mixing

  • Chris J.C. Burges,
  • John Platt

in Semi-Supervised Learning

Published by MIT Press | 2006

Recently, graph-based algorithms, in which nodes represent data points and links encode similarities, have become popular for semi-supervised learning. In this chapter we introduce a general probabilistic formulation called ‘Conditional Harmonic Mixing’ (CHM), in which the links are directed, a conditional probability matrix is associated with each link, and the number of classes can vary from node to node. The posterior class probability at each node is updated by minimizing the KL divergence between its distribution and that predicted by its neighbours. We show that for arbitrary graphs, as long as each unlabeled point is reachable from at least one training point, a solution always exists, is unique, and can be found by solving a sparse linear system iteratively. This result holds even if the graph contains loops, or if the conditional probability matrices are not consistent. We show how, given a classifier for a task, CHM can learn its transition probabilities. Using the Reuters database, we show that CHM improves the accuracy of the best available classifier for small training set sizes.
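
The update described in the abstract admits a simple iterative implementation. Below is a minimal, hypothetical sketch (not the authors' code): it assumes that the KL-minimising update reduces to setting each unlabeled node's posterior to the average of the distributions predicted by its incoming neighbours through their conditional probability matrices, with labeled nodes clamped to their label distributions. The function name `chm_propagate`, the edge-list representation, and the toy graph are all illustrative choices, not part of the chapter.

```python
import numpy as np

def chm_propagate(edges, labels, n_nodes, n_classes, n_iters=200, tol=1e-8):
    """Iterative (Jacobi-style) sketch of a CHM-like fixed-point solve.

    edges  : list of (j, i, M) triples for a directed link j -> i, where M is an
             (n_classes x n_classes) conditional probability matrix P(class_i | class_j)
             whose columns sum to one.
    labels : dict mapping a labeled node index to its class index (posteriors clamped).
    """
    # Initialise all posteriors uniformly; clamp labeled nodes to one-hot distributions.
    Q = np.full((n_nodes, n_classes), 1.0 / n_classes)
    for node, cls in labels.items():
        Q[node] = np.eye(n_classes)[cls]

    # Group incoming edges by target node for the averaging step.
    incoming = {i: [] for i in range(n_nodes)}
    for j, i, M in edges:
        incoming[i].append((j, np.asarray(M, dtype=float)))

    for _ in range(n_iters):
        Q_new = Q.copy()
        for i in range(n_nodes):
            if i in labels or not incoming[i]:
                continue  # labeled nodes stay clamped; nodes with no in-links stay uniform
            # Average of the neighbour-predicted distributions M @ Q[j]
            # (the KL-minimising update under the assumed divergence direction).
            Q_new[i] = np.mean([M @ Q[j] for j, M in incoming[i]], axis=0)
        if np.max(np.abs(Q_new - Q)) < tol:
            Q = Q_new
            break
        Q = Q_new
    return Q

# Toy usage: two classes, three nodes; node 0 is labeled, nodes 1 and 2 are unlabeled
# and reachable from node 0 via directed links, so the iteration has a unique fixed point.
I2 = np.eye(2)  # identity conditionals, i.e. "copy your neighbour's distribution"
edges = [(0, 1, I2), (1, 2, I2), (2, 1, I2)]
posteriors = chm_propagate(edges, labels={0: 1}, n_nodes=3, n_classes=2)
print(posteriors)
```

Under the reachability condition stated in the abstract, a sweep of this kind should converge to the fixed point of the associated sparse linear system; the sketch simply iterates that system directly rather than calling a sparse solver.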