H. Attias, Li Deng, Alex Acero, and John Platt
September 2001
We present a new method for speech denoising and robust
speech recognition. Using the framework of probabilistic models
allows us to integrate detailed speech models and models
of realistic non-stationary noise signals in a principled manner.
The framework transforms the denoising problem into a problem
of Bayes-optimal signal estimation, producing minimum mean
square error estimators of desired features of clean speech from
noisy data. We describe a fast and efficient implementation of
an algorithm that computes these estimators. The effectiveness
of this algorithm is demonstrated in robust speech recognition
experiments, using the Wall Street Journal speech corpus and
the Microsoft Whisper large-vocabulary continuous speech recognizer.
Results show significantly lower word error rates than
those obtained under the noisy-matched condition. In particular, when the
denoising algorithm is applied to the noisy training data and
the recognizer is subsequently retrained, very low error rates are
obtained.
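
As a rough illustration of what a minimum mean square error (MMSE) estimator computes, the sketch below evaluates the posterior mean E[x | y] of a clean scalar feature x given a noisy observation y. It is a toy example and not the algorithm described in the paper: it assumes a Gaussian mixture prior on x and additive, stationary Gaussian noise (y = x + n), and the function name `mmse_estimate` and all parameters are hypothetical.

```python
import math


def mmse_estimate(y, weights, means, variances, noise_var):
    """Posterior mean E[x | y] for a scalar x with a Gaussian mixture prior.

    Assumed toy model (not taken from the paper):
        x ~ sum_k weights[k] * N(means[k], variances[k])
        y = x + n,  n ~ N(0, noise_var)
    """
    log_post = []    # unnormalized log posterior probability of each component
    comp_means = []  # posterior mean of x under each component
    for w, mu, var in zip(weights, means, variances):
        total_var = var + noise_var  # variance of y under this component
        log_lik = -0.5 * (math.log(2.0 * math.pi * total_var)
                          + (y - mu) ** 2 / total_var)
        log_post.append(math.log(w) + log_lik)
        gain = var / total_var       # Wiener-style shrinkage factor
        comp_means.append(mu + gain * (y - mu))

    # Normalize in a numerically stable way (subtract the largest log value).
    peak = max(log_post)
    unnorm = [math.exp(lp - peak) for lp in log_post]
    norm = sum(unnorm)

    # MMSE estimate: responsibility-weighted average of the component means.
    return sum(u / norm * m for u, m in zip(unnorm, comp_means))
```

With a single zero-mean component whose variance equals the noise variance, the estimate shrinks the observation halfway toward the prior mean; as the noise variance approaches zero, the estimate approaches the observation itself.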
In Proc. of the Eurospeech Conference
Type: Inproceedings