Xiong Xiao, Jinyu Li, et al.
In this paper, we propose a novel feature space adaptation technique
to improve the robustness of speech recognition in noisy environments.
Histogram equalization (HEQ) is an effective technique for
improving robustness by reducing the difference between clean and
noisy features. A weakness of HEQ is that it does not take the
acoustic model into account, which can result in a mismatch between
HEQ-processed features and the acoustic model. In this paper, we propose
to adapt HEQ to maximize the likelihood of HEQ-processed features
on the acoustic model, with a constraint on the parameters of HEQ.
In addition, we use a Gaussian mixture model (GMM) to represent
the clean feature space rather than using the acoustic model itself,
and this results in both simpler implementation and better results.
Experimental results show that HEQ with adaptation reduces word
error rate by 7.5% and 5.7% on the Aurora-2 and Aurora-4 tasks,
respectively, compared with the HEQ baseline without adaptation.
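
For readers unfamiliar with HEQ, the conventional (unadapted) transform can be sketched as a rank-based mapping of each feature dimension onto a reference distribution. The sketch below is illustrative only and is not the adaptation method proposed in the paper: the function name `heq_to_gaussian`, the choice of a standard normal reference, and the `(rank + 0.5) / n` CDF estimate are all assumptions made for the example.

```python
from statistics import NormalDist


def heq_to_gaussian(values):
    """Equalize one feature dimension of an utterance to a standard normal.

    Hypothetical helper: the reference distribution and the CDF estimate
    are assumptions for illustration, not details taken from the paper.
    """
    n = len(values)
    ref = NormalDist(mu=0.0, sigma=1.0)
    # Sort frame indices by feature value so each frame gets a rank.
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        # Empirical CDF estimate, kept strictly inside (0, 1).
        cdf = (rank + 0.5) / n
        # Map through the inverse CDF of the reference distribution.
        out[i] = ref.inv_cdf(cdf)
    return out


if __name__ == "__main__":
    # One feature dimension across five frames of a toy utterance.
    print(heq_to_gaussian([3.0, 1.0, 2.0, 10.0, -4.0]))
```

Because the mapping depends only on ranks, it preserves the ordering of frames while forcing the output histogram to match the reference; the paper's contribution is to adapt such a transform so that its output better fits the clean-feature model.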
Published in: Proc. ICASSP