Dong Yu, Yun-Cheng Ju, and Alex Acero
In this paper we propose a novel, effective, and efficient utterance verification (UV) technology for access control in the interactive voice response (IVR) systems. The key of our approach is to construct a context-free grammar by using the secret answer to a question and a word N-gram based filler model. The N-gram filler provides rich alternatives to the secret answer and can potentially improve the accuracy of the UV task. It can also absorb carrier words used by callers and thus can improve the robustness. We also propose using a predictor based on the best alternative to calculate the confidence. We show detailed experimental results on a tough UV test set that contains 930 positive and 930 negative cases and discuss types of questions that are suitable for the UV task. We demonstrate that our approach can achieve a 2.14% equal error rate (EER) on average and 0.8% false accept rate if the false reject rate is 2.6% and above. This is a 49% EER reduction compared with the approaches using acoustic fillers, and a 72% EER reduction compared with the posterior probability based confidence measurement. Index Terms: utterance verification, filler model, word spotting, confidence measure
|Published in||Proc. of the Interspeech Conference|
|Publisher||International Speech Communication Association|
© 2007 ISCA. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the ISCA and/or the author.