Dong Yu, Jinyu Li, and Li Deng
Most speech recognition applications in use today rely heavily on confidence measures to make optimal decisions. In this work, we aim to answer the question: what can be done to improve the quality of confidence measures if we cannot modify the speech recognition engine? The answer provided in this paper is a post-processing step called confidence calibration, which can be viewed as a special adaptation technique applied to confidence measures. Three confidence calibration methods have been developed in this work: the maximum entropy model with distribution constraints, the artificial neural network, and the deep belief network. We compare these approaches and demonstrate the importance of the key features exploited: the generic confidence score, the application-dependent word distribution, and the rule coverage ratio. We demonstrate the effectiveness of confidence calibration on a variety of tasks with significant normalized cross entropy increase and equal error rate reduction.
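The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions: normalized cross entropy (NCE) compares the cross entropy of the confidence scores against a baseline that assigns every word the average correctness rate (higher is better), and equal error rate (EER) is the operating point where the false-accept and false-reject rates coincide (lower is better). The function names and the clamping constant `eps` are hypothetical choices for illustration, not taken from the paper.

```python
import math


def normalized_cross_entropy(confidences, labels, eps=1e-12):
    """NCE = (H_max - H_conf) / H_max, with labels 1 = correct, 0 = incorrect."""
    n = len(labels)
    n_correct = sum(labels)
    if n_correct == 0 or n_correct == n:
        raise ValueError("need at least one correct and one incorrect word")
    p_c = n_correct / n
    # Entropy of the baseline that always outputs the average correctness p_c.
    h_max = -(n_correct * math.log2(p_c) + (n - n_correct) * math.log2(1 - p_c))
    # Cross entropy of the actual confidence scores.
    h_conf = 0.0
    for c, y in zip(confidences, labels):
        c = min(max(c, eps), 1 - eps)  # clamp to avoid log(0); assumption
        h_conf -= math.log2(c) if y else math.log2(1 - c)
    return (h_max - h_conf) / h_max


def equal_error_rate(confidences, labels):
    """Approximate EER by sweeping thresholds over the observed scores."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one correct and one incorrect word")
    best_gap, best_eer = None, None
    for t in sorted(set(confidences)) + [float("inf")]:
        # A word is accepted when its confidence is at least the threshold.
        false_accept = sum(1 for c, y in zip(confidences, labels) if c >= t and not y) / n_neg
        false_reject = sum(1 for c, y in zip(confidences, labels) if c < t and y) / n_pos
        gap = abs(false_accept - false_reject)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (false_accept + false_reject) / 2
    return best_eer
```

Under these definitions, a confidence score that always equals the average correctness rate yields an NCE of zero, so a calibration method is useful only to the extent that it pushes NCE above zero on held-out data.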
In IEEE Transactions on Audio, Speech, and Language Processing
Publisher IEEE SPS