Noise Adaptive Training Using a Vector Taylor Series Approach for Robust Automatic Speech Recognition

In traditional methods for noise-robust automatic speech recognition, the acoustic models are typically trained either on clean speech or on multi-condition data processed by the same feature enhancement algorithm that is expected to be used in decoding. In this paper, we propose a noise adaptive training (NAT) algorithm that can be applied to all training data and that normalizes the environmental distortion as part of model training. In contrast to feature enhancement methods, NAT estimates the underlying “pseudo-clean” model parameters directly, without relying on point estimates of the clean speech features as an intermediate step. The pseudo-clean model parameters learned with NAT are later used with vector Taylor series (VTS) model adaptation to decode noisy utterances at test time. Experiments on the Aurora 2 and Aurora 3 tasks demonstrate that the proposed NAT method obtains relative improvements of 18.83% and 32.02%, respectively, over VTS model adaptation.
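To make the VTS model adaptation step mentioned in the abstract concrete, the following is a minimal sketch of first-order VTS compensation for a single Gaussian. It is not taken from the paper. It assumes the commonly used log-spectral mismatch function y = x + h + log(1 + exp(n - x - h)), diagonal covariances, and a deterministic channel term; the function name `vts_adapt_gaussian` and all parameter names are hypothetical. Systems that work with cepstral features would also need the DCT and its inverse, which this sketch omits.

```python
import math


def vts_adapt_gaussian(mu_x, var_x, mu_n, var_n, mu_h):
    """First-order VTS compensation of one diagonal Gaussian (log-spectral domain).

    mu_x, var_x: pseudo-clean speech mean and variance per dimension
    mu_n, var_n: additive noise mean and variance per dimension
    mu_h:        channel (convolutional distortion) mean per dimension
    Returns the adapted noisy-speech mean and variance as two lists.
    """
    mu_y, var_y = [], []
    for mx, vx, mn, vn, mh in zip(mu_x, var_x, mu_n, var_n, mu_h):
        d = mn - mx - mh
        # Mismatch function evaluated at the expansion point (the means).
        mu_y.append(mx + mh + math.log1p(math.exp(d)))
        # Jacobian dy/dx at the expansion point; dy/dn equals 1 - g.
        g = 1.0 / (1.0 + math.exp(d))
        # Linearized variance: contributions from speech and from noise.
        var_y.append(g * g * vx + (1.0 - g) ** 2 * vn)
    return mu_y, var_y
```

Under these assumptions, the adapted Gaussian approaches the pseudo-clean Gaussian (shifted by the channel) when the noise is far below the speech level, and approaches the noise Gaussian when the noise dominates.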


In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing

Publisher: Institute of Electrical and Electronics Engineers, Inc.
© 2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Details

Type: Inproceedings
Address: Taipei, Taiwan