Jun Du and Ren-Hua Wang
In this paper, we propose a new feature normalization approach for robust speech recognition. We find that the shape of speech feature distributions changes in noisy environments compared with the clean condition. We therefore perform cepstral shape normalization (CSN), which normalizes the shape of the feature distributions by means of an exponential factor. The method proves effective in noisy environments, especially at low SNRs. Experimental results show that the proposed method yields relative word error rate reductions of 38% and 25% on the Aurora2 and Aurora3 databases, respectively, compared with conventional mean and variance normalization (MVN). CSN also consistently outperforms other traditional methods, such as histogram equalization (HEQ) and higher order cepstral moment normalization (HOCMN).
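To make the contrast between MVN and shape normalization concrete, the following is a minimal sketch and not the authors' CSN algorithm, whose exact formulation the abstract does not give. It assumes one cepstral coefficient tracked over an utterance as a list of floats. The function `mvn` is standard mean and variance normalization; `shape_warp`, its parameter `alpha`, and the use of kurtosis as a shape indicator are hypothetical illustrations of how an exponential factor can alter the shape of a distribution that MVN alone leaves unchanged.

```python
import math
from statistics import fmean, pstdev


def mvn(track):
    """Mean and variance normalization of one cepstral coefficient track."""
    mu = fmean(track)
    sigma = pstdev(track, mu)
    if sigma == 0.0:
        # A constant track carries no variance to normalize.
        return [0.0 for _ in track]
    return [(x - mu) / sigma for x in track]


def shape_warp(track, alpha):
    """Hypothetical sign-preserving power warp: y = sign(x) * |x| ** alpha.

    This is an assumed illustration of an exponential factor, not the
    paper's CSN formula. alpha = 1 leaves the track unchanged.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return [math.copysign(abs(x) ** alpha, x) for x in track]


def kurtosis(track):
    """Fourth standardized moment, one simple indicator of distribution shape."""
    mu = fmean(track)
    var = fmean([(x - mu) ** 2 for x in track])
    return fmean([(x - mu) ** 4 for x in track]) / var ** 2


if __name__ == "__main__":
    track = [0.2, -1.5, 0.7, 3.1, -0.4, 0.1, -2.2, 0.9]  # made-up values
    z = mvn(track)
    warped = mvn(shape_warp(z, alpha=0.8))  # alpha chosen arbitrarily
    print(kurtosis(z), kurtosis(warped))
```

MVN fixes only the first two moments, so the kurtosis of `z` equals that of `track`; a warp of this kind changes the higher moments, which is the general idea that the abstract attributes to normalizing distribution shape.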
Published in: Proc. of ICASSP 2008
© 2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. http://www.ieee.org/