Search Vox: Leveraging multimodal refinement and partial knowledge for mobile voice search

Internet usage on mobile devices continues to grow as users seek anytime, anywhere access to information. Because users frequently search for businesses, directory assistance has been the focus of many voice search applications utilizing speech as the primary input modality. Unfortunately, mobile settings often contain noise which degrades performance. As such, we present Search Vox, a mobile search interface that not only facilitates touch and text refinement whenever speech fails, but also allows users to assist the recognizer via text hints. Search Vox can also take advantage of any partial knowledge users may have about the business listing by letting them express their uncertainty in an intuitive way using verbal wildcards. In simulation experiments conducted on real voice search data, leveraging multimodal refinement resulted in a 28% relative reduction in error rate. Providing text hints along with the spoken utterance resulted in even greater relative reduction, with dramatic gains in recovery for each additional character.

PDF file

In  Proceedings of User Interface Software and Technology (UIST)

Publisher  Association for Computing Machinery, Inc.
Copyright © 2007 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or The definitive version of this paper can be found at ACM’s Digital Library --


> Publications > Search Vox: Leveraging multimodal refinement and partial knowledge for mobile voice search