Exploiting Variances in Robust Feature Extraction Based on a Parametric Model of Speech Distortion

  • Li Deng,
  • Jasha Droppo,
  • Alex Acero

Proc. International Conference on Spoken Language Processing

This paper presents a technique that exploits the variance of the denoised speech features, estimated during speech feature enhancement, to improve noise-robust speech recognition. The technique provides an alternative to the Bayesian predictive classification decision rule: it integrates over the feature space instead of the model-parameter space, which yields a much simpler system implementation and lower computational cost. We extend our earlier work with a new approach to statistical feature enhancement that is based on a parametric model of speech distortion and therefore requires no stereo training data, and we develop a novel algorithm for estimating the variance of the enhanced speech features. Experimental evaluation on the full Aurora2 test sets demonstrates an 11.4% reduction in digit error rate, averaged over all noise and SNR conditions, compared with the best technique we had developed prior to this work [2], which also required no stereo training data but did not exploit the variance information.
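
To illustrate what integrating over the feature space can look like, the sketch below works through the simplest case. It is not the paper's algorithm. It assumes (our assumption, not stated in the abstract) that each acoustic model component is a diagonal-covariance Gaussian and that the enhancement step returns a point estimate `x_hat` together with a per-dimension variance `enh_var`. Under those assumptions the integral of the model likelihood against the Gaussian uncertainty of the enhanced feature has a closed form: the enhancement variance is simply added to the model variance. The function names are hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_likelihood_with_uncertainty(x_hat, enh_var, mean, var):
    """Log of the integral over clean features x of
    N(x; mean, var) * N(x; x_hat, enh_var).

    For Gaussians this integral equals N(x_hat; mean, var + enh_var),
    so the enhancement variance inflates the model variance.
    Illustrative assumption: both densities are diagonal Gaussians.
    """
    inflated = [v + e for v, e in zip(var, enh_var)]
    return log_gaussian(x_hat, mean, inflated)


# Hypothetical two-dimensional example: the second dimension was
# enhanced with high uncertainty, so it contributes less sharply.
x_hat = [0.8, -1.5]
enh_var = [0.05, 2.0]
model_mean = [1.0, 0.0]
model_var = [0.5, 0.5]

point_score = log_gaussian(x_hat, model_mean, model_var)
uncertain_score = log_likelihood_with_uncertainty(
    x_hat, enh_var, model_mean, model_var
)
```

With `enh_var` set to zero the second score reduces to the ordinary point-estimate likelihood, which is why this kind of rule can be added to an existing recognizer by changing only the variances used at decoding time.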