Conceptual Set Covering: Improving Fit-and-Split Algorithms

Carl M. Kadie

Microsoft Research, Bldg 9S
Redmond 98052-6399, WA

Author Email: carlk@microsoft.com

Abstract:

Many learning systems implicitly use the fit­and­split learning method to create a comprehensive hypothesis from a set of partial hypotheses. At the core of the fit­and­split method is the assignment of  examples to partial hypotheses. To date, however, this core has been neglected. This paper provides the first definition and model of the fit­and­split assignment problem. Extant systems perform assignment nearly arbitrarily, implicitly using, for example, greedy set covering. This paper also presents Conceptual Set Covering (CSC), a new assignment algorithm. An extensive empirical evaluation over a wide range of learning problems suggests that CSC can improve any fit­and­split learning system.

Proceedings of the Seventh International Conference on Machine Learning, Austin, Texas, June 1990. (postscript)