Tim Paek, Yun-Cheng Ju, and Christopher Meek
Automated Directory Assistance (ADA) allows users to request telephone or address information for residential and business listings using speech recognition. Because callers often express listings differently from how they are registered in the directory, ADA systems require transcriptions of alternative phrasings for directory listings as training data, which can be costly to acquire. Consequently, a framework in which data can be contributed voluntarily by large numbers of Internet users has tremendous value. In this paper, we introduce People Watcher, a computer game that elicits transcribed, alternative user phrasings for directory listings while at the same time entertaining players. Data generated from the game not only overlapped with actual audio transcriptions but also yielded a statistically significant 15% relative reduction in semantic error rate when used for ADA. Furthermore, the resulting semantic accuracy was not statistically different from that obtained using the actual audio transcriptions.
Publisher: International Speech Communication Association
© 2007 ISCA. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the ISCA and/or the author.