Haihua Xu, Daniel Povey, Lidia Mangu, and Jie Zhu
In this paper we describe a method for Minimum Bayes Risk (MBR) decoding for speech recognition. The technique is similar to Consensus decoding (also known as Confusion Network Decoding): given a lattice of alternative outputs, we attempt to find the hypothesis that minimizes the Bayes risk with respect to the word error rate (WER). Our method is an EM-like technique whose approximations we believe are less severe than those made in Consensus. Our experimental results show an improvement in WER, both for lattice rescoring and for lattice-based system combination, over baselines such as Consensus, Confusion Network Combination and ROVER.
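To make the objective concrete, the following is a minimal sketch of the Bayes risk criterion on a small N-best list rather than a lattice. It is not the lattice-based, EM-like procedure of this paper: it simply evaluates the expected word-level edit distance of each candidate exhaustively. The function names `edit_distance` and `mbr_decode`, and the toy hypotheses and posteriors, are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # delete wa
                            curr[j - 1] + 1,      # insert wb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def mbr_decode(nbest: List[Tuple[List[str], float]]) -> Tuple[List[str], float]:
    """Return the N-best entry with the lowest expected edit distance.

    Assumption: candidates are restricted to the N-best entries themselves,
    and the given scores are posteriors (renormalized here to sum to one).
    """
    total = sum(p for _, p in nbest)
    best_hyp: List[str] = []
    best_risk = float("inf")
    for cand, _ in nbest:
        # Expected number of word errors if `cand` is output.
        risk = sum((p / total) * edit_distance(cand, ref) for ref, p in nbest)
        if risk < best_risk:
            best_hyp, best_risk = cand, risk
    return best_hyp, best_risk


# Toy example (made-up posteriors): the most probable hypothesis is "a b c",
# but "a x c" has the lower expected word error.
nbest = [("a b c".split(), 0.4), ("a x c".split(), 0.3), ("a x d".split(), 0.3)]
hyp, risk = mbr_decode(nbest)
print(" ".join(hyp), round(risk, 3))
```

In this toy case the highest-posterior hypothesis has an expected error of 0.9 words, while the selected one has 0.7. This illustrates why minimizing expected WER can differ from picking the single most probable sentence.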