A Bayesian Classifier for Spectrographic Mask Estimation for Missing Feature Speech Recognition

Michael Seltzer, B. Raj, and R. Stern

Abstract

Missing feature methods of noise compensation for speech recognition operate by first identifying components of a spectrographic representation of speech that are considered to be corrupt. Recognition is then performed either using only the remaining reliable components or after reconstructing the corrupt components. These methods require a spectrographic mask that accurately labels the reliable and corrupt regions of the spectrogram. Depending on the missing feature method applied, these masks must contain either binary values or probabilistic values. Current mask estimation techniques rely on explicit estimation of the characteristics of the corrupting noise. The estimation process usually assumes that the noise is pseudo-stationary or varies slowly with time. This is a significant drawback since the missing feature methods themselves have no such restrictions. We present a new mask estimation technique that uses a Bayesian classifier to determine the reliability of spectrographic elements. The features used for classification were designed to make no assumptions about the corrupting noise signal, but rather to exploit characteristics of the speech signal itself. Experiments were performed on speech corrupted by a variety of noises, using missing feature compensation methods that require binary masks and probabilistic masks. In all cases, the proposed Bayesian mask estimation method resulted in significantly better recognition accuracy than conventional mask estimation approaches.

© 2004 Elsevier B.V. All rights reserved.
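As a rough illustration of how a Bayesian classifier can yield both kinds of mask, the sketch below computes a posterior probability of reliability for each spectrographic element and thresholds it to obtain a binary mask. It is a minimal sketch under stated assumptions, not the classifier described in the paper: the single scalar feature per element, the Gaussian class-conditional densities, the parameter values, and the names `reliability_posterior` and `estimate_masks` are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def reliability_posterior(x, reliable, corrupt, prior_reliable=0.5):
    """P(reliable | x) from Bayes' rule for a two-class problem.

    `reliable` and `corrupt` are (mean, variance) pairs. These are
    illustrative placeholders, not parameters taken from the paper.
    """
    p_r = gaussian_pdf(x, *reliable) * prior_reliable
    p_c = gaussian_pdf(x, *corrupt) * (1.0 - prior_reliable)
    return p_r / (p_r + p_c)


def estimate_masks(features, reliable, corrupt, prior_reliable=0.5, threshold=0.5):
    """Return (probabilistic_mask, binary_mask) for a 2-D grid of features.

    The probabilistic mask holds the posterior for each element; the
    binary mask marks an element reliable (1) when the posterior
    exceeds `threshold`.
    """
    soft = [
        [reliability_posterior(x, reliable, corrupt, prior_reliable) for x in row]
        for row in features
    ]
    hard = [[1 if p > threshold else 0 for p in row] for row in soft]
    return soft, hard


# Hypothetical feature values, one per time-frequency element.
features = [[2.1, -0.3, 1.8], [-1.2, 0.1, 2.5]]
soft_mask, hard_mask = estimate_masks(
    features, reliable=(2.0, 1.0), corrupt=(0.0, 1.0)
)
```

Here `soft_mask` could serve a method that expects probabilistic values, while `hard_mask` could serve one that expects binary labels. In this toy setup the two classes share a variance and have equal priors, so the decision boundary falls midway between the two assumed means.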

Details

Publication type: Inproceedings
Published in: Speech Communication