A unified spectral transformation adaptation approach for robust speech recognition

In this paper, Canonical Correlation Based Compensation (CCBC) is proposed as a unified approach to coping with the mismatch between training and test conditions. This mismatch can be grouped into three classes: speaker differences, changes in the recording channel, and the effects of a noisy environment. In previous work, we successfully used the CCBC approach, with some modifications, to make our speech recognizer robust to noisy environments [1]. Recently, the same approach has been extended to speaker and channel adaptation. Our experimental results show that the CCBC approach compensates well for all three sources of distortion between training and test conditions. To compare the performance of CCBC with that of conventional adaptation approaches, we also tested cepstral mean normalization, RASTA, and Lin-Log RASTA, and found that CCBC outperforms them. The selection of appropriate reference speech data, a very important problem in the CCBC approach, is also discussed.


Publisher: International Speech Communication Association
© 2007 ISCA. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the ISCA and/or the author.

