Ye-Yi Wang and Alex Acero
It is a conventional wisdom in the speech community that better speech recognition accuracy is a good indicator for better spoken language understanding accuracy, given a fixed understanding component. The findings in this work reveal that this is not always the case. More important than word error rate reduction, the language model for recognition should be trained to match the optimization objective for understanding. In this work, we applied a spoken language understanding model as the language model in speech recognition. The model was obtained with an example-based learning algorithm that optimized the understanding accuracy. Although the speech recognition word error rate is 46% higher than the trigram model, the overall slot understanding error can be reduced by as much as 17%.
|Published in||IEEE Workshop on Automatic Speech Recognition and Understanding|
|Address||St. Thomas, US Virgin Islands|
|Publisher||Institute of Electrical and Electronics Engineers, Inc.|
© 2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.