Voice Search of Structured Media Data

Young-In Song, Ye-Yi Wang, Yun-Cheng Ju, Mike Seltzer, Ivan Tashev, and Alex Acero

Abstract

This paper addresses the problem of using unstructured queries to search a structured database in voice search applications. Incorporating the structural information in music metadata reduced the end-to-end search error by 15% on text queries and by up to 11% on spoken queries. Building on this, an HMM sequential rescoring model reduced the error rate by 28% on text queries and by up to 23% on spoken queries relative to the baseline system. In addition, a phonetic similarity model was introduced to compensate for speech recognition errors, improving end-to-end search accuracy consistently across different levels of speech recognition accuracy.
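To illustrate the general idea of matching a possibly misrecognized spoken query against structured music metadata, the following is a minimal sketch, assuming a toy catalog with artist/album/track fields and a character-level similarity as a crude stand-in for the paper's phonetic similarity and HMM rescoring models; the field names, catalog entries, and scoring scheme are illustrative assumptions, not the authors' actual method.

```python
# Hypothetical sketch: ranking structured catalog entries against an ASR
# hypothesis using a simple similarity measure as a proxy for phonetic matching.
from difflib import SequenceMatcher

# Toy structured catalog (illustrative data only).
CATALOG = [
    {"artist": "the beatles", "album": "abbey road", "track": "come together"},
    {"artist": "norah jones", "album": "come away with me", "track": "don't know why"},
]

def similarity(a: str, b: str) -> float:
    """Character-level similarity as a crude stand-in for phonetic distance."""
    return SequenceMatcher(None, a, b).ratio()

def score_entry(query: str, entry: dict) -> float:
    """Score a catalog entry by its best-matching structured field."""
    return max(similarity(query, value) for value in entry.values())

def search(query: str, top_n: int = 1):
    """Rank catalog entries by similarity to the (possibly misrecognized) query."""
    ranked = sorted(CATALOG, key=lambda e: score_entry(query, e), reverse=True)
    return ranked[:top_n]

if __name__ == "__main__":
    # "come togither" mimics a recognition error; fuzzy matching against the
    # structured fields still recovers the intended track.
    print(search("come togither"))
```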

Details

Publication type: Inproceedings
Published in: International Conference on Acoustics, Speech and Signal Processing
Address: Taipei, Taiwan
Publisher: Institute of Electrical and Electronics Engineers, Inc.