Todd Kulesza, Saleema Amershi, Rich Caruana, Danyel Fisher, and Denis Charles
Labeling data is a seemingly simple task required for training many machine learning systems, but is actually fraught with problems. This paper introduces the notion of concept evolution, the changing nature of a person’s underlying concept (the abstract notion of the target class a person is labeling for, e.g., spam email, travel related web pages) which can result in inconsistent labels and thus be detrimental to machine learning. We introduce two structured labeling solutions, a novel technique we propose for helping people define and refine their concept in a consistent manner as they label. Through a series of five experiments, including a controlled lab study, we illustrate the impact and dynamics of concept evolution in practice and show that structured labeling helps people label more consistently in the presence of concept evolution than traditional labeling.
|Published in||Proceedings of the Conference on Human Factors in Computing Systems (CHI 2014)|