John C. Platt, John Shawe-Taylor, and Nello Cristianini
We present a new learning architecture: the Decision Directed Acyclic Graph (DDAG), which is used to combine many twoclass classifiers into a multiclass classifier. For an Nclass problem, the DDAG contains N(N-1)/2 classifiers, one for each pair of classes. We present a VC analysis of the case when the node classifiers are hyperplanes; the resulting bound on the test error depends on N and on the margin achieved at the nodes, but not on the dimension of the space. This motivates an algorithm, DAGSVM, which operates in a kernelinduced feature space and uses twoclass maximal margin hyperplanes at each decisionnode of the DDAG. The DAGSVM is substantially faster to train and evaluate than either the standard algorithm or Max Wins, while maintaining comparable accuracy to both of these algorithms.
In Proc. Advances in Neural Information Processing Systems 12