Omar Alonso, Catherine C. Marshall, and Marc Najork
This paper describes an approach to improving the reliability of a crowdsourced labeling task for which there is no objective right answer. Our approach focuses on three contingent elements of the labeling task: data quality, worker reliability, and task design. We describe how we developed and applied this framework to the task of labeling tweets according to their interestingness. We use in-task CAPTCHAs to identify unreliable workers, and measure inter-rater agreement to decide whether subtasks have objective or merely subjective answers.
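As an illustration of the inter-rater agreement measurement mentioned above, the following is a minimal sketch that computes Fleiss' kappa from per-item label counts. The abstract does not say which agreement statistic is used, so the choice of Fleiss' kappa, the function name `fleiss_kappa`, and the example counts are all assumptions made for illustration.

```python
import math


def fleiss_kappa(counts):
    """Fleiss' kappa for a list of items (hypothetical helper, not from the paper).

    counts[i][j] is the number of workers who assigned item i to category j.
    Every item must have been labeled by the same number of workers (>= 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two raters per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")

    n_categories = len(counts[0])

    # Observed agreement: mean over items of the fraction of agreeing rater pairs.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    total = n_items * n_raters
    proportions = [
        sum(row[j] for row in counts) / total for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)

    if math.isclose(p_expected, 1.0):
        # All ratings fall in one category; kappa is undefined in this case.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Illustrative counts only (not data from the paper): three workers per tweet,
# columns are [interesting, not interesting].
example_counts = [[3, 0], [0, 3], [2, 1], [1, 2]]
kappa = fleiss_kappa(example_counts)
```

A value near 1 indicates that workers largely agree, while a value near 0 indicates agreement no better than chance, which is the kind of signal that could suggest a subtask has a merely subjective answer. Any cutoff between the two cases would be a judgment call and is not specified here.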
In Human Computation 2013
Copyright (c) 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.