Front-End, Back-End, and Hybrid Techniques for Noise-Robust Speech Recognition

  • Li Deng

Chapter 4, in D. Kolossa and R. Haeb-Umbach (eds.), Robust Speech Recognition of Uncertain Data

Published by Springer Verlag | 2011

Publication

Noise robustness has long been an active area of research that attracts significant interest from speech recognition researchers and developers. In this chapter, with a focus on the problem of uncertainty handling in robust speech recognition, we use the Bayesian framework as a common thread for connecting, analyzing, and categorizing a number of popular approaches pursued in the recent past. The topics covered in this chapter include 1) Bayesian decision rules with unreliable features and unreliable model parameters; 2) principled ways of computing feature uncertainty using structured speech distortion models; 3) use of a phase factor in an advanced speech distortion model for feature compensation; 4) a novel perspective on model compensation as a special implementation of the general Bayesian predictive classification rule capitalizing on model parameter uncertainty; 5) a taxonomy of noise compensation techniques along two distinct axes, feature vs. model domain and structured vs. unstructured transformation; and 6) noise-adaptive training as a hybrid feature-model compensation framework, together with its various extensions.