Zhi-Jie Yan, Frank K. Soong, and Ren-Hua Wang
15 April 2007
This paper presents a word graph based feature enhancement method for robust speech recognition in noise. The approach uses signal processing based speech enhancement as a starting point and then applies Wiener filtering to remove residual noise. During this process, a decoded word graph directly guides the feature enhancement with respect to the HMMs used for recognition, so that the enhanced features better match the clean speech models in the acoustic space. The proposed method was tested on the Aurora 2 database. Experimental results show improved recognition performance compared with conventional signal processing based and GMM based feature enhancement methods. Signal processing based weighted noise estimation and the GMM based method yield relative error rate reductions of 35.44% and 42.58%, respectively. The proposed word graph based method improves performance further, yielding a relative error rate reduction of 57.89%.
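The three relative error rate reductions quoted above can be compared with one another through a short calculation. The sketch below assumes that all three percentages are measured against the same baseline error rate, which the abstract does not state explicitly; the function names and the `REPORTED` dictionary are illustrative and not taken from the paper.

```python
def remaining_error_fraction(rel_reduction_pct):
    """Fraction of the baseline error left after a relative reduction (in %)."""
    return 1.0 - rel_reduction_pct / 100.0


def relative_gain(from_pct, to_pct):
    """Relative error rate reduction (in %) of one method over another.

    Both arguments are relative reductions (in %) with respect to the SAME
    baseline. That shared baseline is an assumption made for this sketch.
    """
    before = remaining_error_fraction(from_pct)
    after = remaining_error_fraction(to_pct)
    return 100.0 * (before - after) / before


# Relative error rate reductions quoted in the abstract.
REPORTED = {
    "signal_processing": 35.44,
    "gmm": 42.58,
    "word_graph": 57.89,
}

if __name__ == "__main__":
    for name in ("signal_processing", "gmm"):
        gain = relative_gain(REPORTED[name], REPORTED["word_graph"])
        print(f"word_graph vs {name}: {gain:.2f}% fewer errors")
```

Under that assumption, the word graph based method removes roughly 27% of the errors that remain after GMM based enhancement, and roughly 35% of those that remain after signal processing based enhancement.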
In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007)
© 2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. http://www.ieee.org/