HMM-based Strategies for Enhancement of Speech Signals Embedded in Nonstationary Noise

Li Deng

IEEE Trans. on Speech and Audio Processing | January 1998, Vol 6: pp. 445-455

An improved hidden Markov model-based (HMM-based) speech enhancement system, designed using the minimum mean square error principle, is implemented and compared with a conventional spectral subtraction system. The improvements to the system are: 1) incorporation of mixture components in the HMM for noise, in order to handle noise nonstationarity more flexibly, 2) two efficient methods in the speech enhancement system design that make the system implementable in real time, and 3) a method for adapting to the noise type, in order to accommodate the wide variety of noises expected in the enhancement system's operating environment. Results are reported for experiments designed to evaluate the performance of the HMM-based speech enhancement systems in comparison with spectral subtraction. Three types of noise were used to corrupt the test speech signals: white noise, simulated helicopter noise, and multitalker (cocktail party) noise. Both objective (global SNR) and subjective (mean opinion score, MOS) evaluations demonstrate consistent superiority of the HMM-based enhancement systems that incorporate the innovations described in this paper over the conventional spectral subtraction method.
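For readers unfamiliar with the baseline and the objective metric named in the abstract, the following is a minimal sketch of magnitude spectral subtraction and of a global SNR measure. It is not the paper's implementation: the abstract gives no formulas, so the function names (`global_snr_db`, `spectral_subtraction`), the frame length, the spectral floor, and the use of non-overlapping rectangular frames are all assumptions made for illustration. Global SNR is taken here in its common form, the ratio of clean-signal energy to the energy of the difference between the clean and processed signals, in decibels.

```python
import numpy as np


def global_snr_db(clean, estimate):
    """Global SNR in dB: clean energy over error energy (assumed definition)."""
    clean = np.asarray(clean, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error_energy = np.sum((clean - estimate) ** 2)
    return 10.0 * np.log10(np.sum(clean ** 2) / error_energy)


def spectral_subtraction(noisy, noise_only, frame_len=256, floor=0.01):
    """Basic magnitude spectral subtraction (illustrative baseline only).

    noisy      : noisy speech samples
    noise_only : a noise-only segment used to estimate the noise spectrum
    frame_len  : hypothetical frame length; frames do not overlap here
    floor      : hypothetical spectral floor, as a fraction of the noisy magnitude
    """
    noisy = np.asarray(noisy, dtype=float)
    noise_only = np.asarray(noise_only, dtype=float)

    # Average noise magnitude spectrum over the noise-only frames.
    n_noise = len(noise_only) // frame_len
    noise_frames = noise_only[: n_noise * frame_len].reshape(n_noise, frame_len)
    noise_mag = np.abs(np.fft.rfft(noise_frames, axis=1)).mean(axis=0)

    # Split the noisy signal into frames and transform each one.
    n_frames = len(noisy) // frame_len
    frames = noisy[: n_frames * frame_len].reshape(n_frames, frame_len)
    spec = np.fft.rfft(frames, axis=1)
    mag = np.abs(spec)

    # Subtract the noise estimate, keeping a small floor to avoid negative magnitudes.
    enhanced_mag = np.maximum(mag - noise_mag, floor * mag)

    # Resynthesize each frame using the noisy phase.
    enhanced_spec = enhanced_mag * np.exp(1j * np.angle(spec))
    enhanced = np.fft.irfft(enhanced_spec, n=frame_len, axis=1)
    return enhanced.reshape(-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(8192)
    clean = np.sin(2 * np.pi * 16 * t / 256)      # stand-in for a speech signal
    noisy = clean + 0.5 * rng.standard_normal(t.size)
    noise_ref = 0.5 * rng.standard_normal(t.size)  # separate noise-only recording
    enhanced = spectral_subtraction(noisy, noise_ref)
    print("noisy SNR (dB):   ", global_snr_db(clean, noisy))
    print("enhanced SNR (dB):", global_snr_db(clean, enhanced))
```

The noise estimate in this sketch is a single fixed spectrum, which is the limitation the abstract's first improvement targets: a noise model with several mixture components can follow noise whose spectrum changes over time. A practical system would also use overlapping windowed frames with overlap-add resynthesis.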