Amplifying Community Content Creation with Mixed-Initiative Information Extraction

  • Raphael Hoffmann,
  • ,
  • Kayur Patel,
  • Fei Wu,
  • James Fogarty,
  • Daniel S. Weld

In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2009)

Published by ACM

Best Paper Honorable Mention

Publication

Although existing work has explored both information extraction and community content creation, most research has focused on them in isolation. In contrast, we see the greatest leverage in the synergistic pairing of these methods as two interlocking feedback cycles. This paper explores the potential synergy promised if these cycles can be made to accelerate each other by exploiting the same edits to advance both community content creation and learning-based information extraction. We examine our proposed synergy in the context of Wikipedia infoboxes and the Kylin information extraction system. After developing and refining a set of interfaces to present the verification of Kylin extractions as a non-primary task in the context of Wikipedia articles, we develop an innovative use of Web search advertising services to study people engaged in some other primary task. We demonstrate our proposed synergy by analyzing our deployment from two complementary perspectives: (1) we show we accelerate community content creation by using Kylin’s information extraction to significantly increase the likelihood that a person visiting a Wikipedia article as a part of some other primary task will spontaneously choose to help improve the article’s infobox, and (2) we show we accelerate information extraction by using contributions collected from people interacting with our designs to significantly improve Kylin’s extraction performance.