Conceptual Set Covering: Improving Fit-and-Split Algorithms

  • Carl Kadie

Published by Morgan Kaufmann Publishers

Many learning systems implicitly use the fit-and-split learning method to create a comprehensive hypothesis from a set of partial hypotheses. At the core of the fit-and-split method is the assignment of examples to partial hypotheses. To date, however, this core has been neglected. This paper provides the first definition and model of the fit-and-split assignment problem. Extant systems perform assignment nearly arbitrarily, implicitly using, for example, greedy set covering. This paper also presents Conceptual Set Covering (CSC), a new assignment algorithm. An extensive empirical evaluation over a wide range of learning problems suggests that CSC can improve any fit-and-split learning system.