Distribution-Calibrated Hierarchical Classification

Ofer Dekel

Advances in Neural Information Processing Systems 22 | December 2009

While many advances have already been made in hierarchical classification learning, we take a step back and examine how a hierarchical classification problem should be formally defined. We pay particular attention to the fact that many arbitrary decisions go into the design of the label taxonomy that is given with the training data. Moreover, many hand-designed taxonomies are unbalanced and misrepresent the class structure in the underlying data distribution. We attempt to correct these problems by using the data distribution itself to calibrate the hierarchical classification loss function. This distribution-based correction must be done with care, to avoid introducing unmanageable statistical dependencies into the learning problem. This leads us off the beaten path of binomial-type estimation and into the unfamiliar waters of geometric-type estimation. In this paper, we present a new calibrated definition of statistical risk for hierarchical classification, an unbiased estimator for this risk, and a new algorithmic reduction from hierarchical classification to cost-sensitive classification.
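
The contrast between binomial-type and geometric-type estimation can be illustrated with a small sketch. This is not the estimator from the paper. It is a generic textbook example, and all function names in it are hypothetical. It assumes the quantity of interest is -log(p) for an unknown probability p. Counting successes in a fixed number of draws (binomial-type) and plugging the frequency into the logarithm gives a biased estimate. Drawing until the first success (geometric-type) and returning the harmonic number H_{N-1}, where N is the number of draws, gives an unbiased one, because E[H_{N-1}] = sum over k >= 1 of (1-p)^k / k = -log(p).

```python
import math
import random

def harmonic(n):
    """Return H_n = 1 + 1/2 + ... + 1/n, with H_0 = 0."""
    return sum(1.0 / k for k in range(1, n + 1))

def draws_until_success(p, rng):
    """Count Bernoulli(p) draws up to and including the first success."""
    n = 1
    while rng.random() >= p:  # a draw below p counts as a success
        n += 1
    return n

def geometric_neg_log_estimate(p, rng):
    """One unbiased sample of -log(p): H_{N-1} with N geometric."""
    return harmonic(draws_until_success(p, rng) - 1)

def binomial_neg_log_estimate(p, m, rng):
    """Plug-in estimate -log(k/m) from m draws; biased, infinite if k == 0."""
    k = sum(rng.random() < p for _ in range(m))
    return math.inf if k == 0 else -math.log(k / m)

# Illustrative values only; p would be unknown in practice.
rng = random.Random(0)
p = 0.2
trials = 20000
geo_mean = sum(geometric_neg_log_estimate(p, rng) for _ in range(trials)) / trials
print(geo_mean, -math.log(p))
```

Averaging many geometric-type samples should land close to -log(p), while the plug-in estimate from a small fixed sample is systematically off and undefined when no success is observed.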