John C. Platt, CCSP Group, Microsoft Research
Nello Cristianini, Department of Engineering Mathematics, University of Bristol
John Shawe-Taylor, Department of Computer Science, Royal Holloway, University of London
Advances in Neural Information Processing Systems 12, pp. 547-553, MIT Press, (2000).
We present a new learning architecture: the Decision
Directed Acyclic Graph (DDAG), which is used to combine many twoclass
classifiers into a multiclass classifier. For an Nclass problem, the DDAG
contains N(N-1)/2 classifiers, one for each pair of classes.
We present a VC analysis of the case when the node classifiers are hyperplanes;
the resulting bound on the test error depends on N and on the margin achieved at
the nodes, but not on the dimension of the space. This motivates an algorithm,
DAGSVM, which operates in a kernelinduced feature space and uses twoclass
maximal margin hyperplanes at each decisionnode of the DDAG. The DAGSVM is
substantially faster to train and evaluate than either the standard algorithm
or Max Wins, while maintaining comparable accuracy to both of these algorithms.
PDF file (84 KB)
PS file (96 KB)