Jinyu Li, Michael L. Seltzer, and Yifan Gong
By exploiting a model of environmental distortion, model adaptation based on vector Taylor series (VTS) has been shown to significantly improve the robustness of speech recognizers to environmental noise. However, the computational cost of VTS model adaptation (MVTS) has kept it from being more widely used. In this paper, we propose to reduce the computational cost of MVTS by replacing the Jacobian matrix used in the vector Taylor series approximation with a diagonal Jacobian matrix (DJVTS). We justify this approximation by showing that the Jacobian matrices are dominated by their diagonal elements, so the model distortion it introduces is very small. DJVTS gives accuracy similar to that of the standard MVTS method with a significant reduction in computational cost. The proposed method also achieves higher accuracy than VTS-based feature enhancement.
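To make the diagonal-Jacobian idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the commonly used cepstral-domain VTS formulation, which this abstract does not spell out: the Jacobian is G = C diag(1 / (1 + exp(C⁻¹(μ_n − μ_x − μ_h)))) C⁻¹, and the adapted covariance is G Σ_x Gᵀ + (I − G) Σ_n (I − G)ᵀ with diagonal Σ_x and Σ_n. The function names, the truncated DCT matrix, and the use of a pseudo-inverse for C⁻¹ are all illustrative assumptions.

```python
import numpy as np


def dct_matrix(num_cep, num_chan):
    """Truncated DCT-II matrix C (num_cep x num_chan); an assumed MFCC-style choice."""
    k = np.arange(num_cep)[:, None]
    m = np.arange(num_chan)[None, :]
    return np.cos(np.pi * k * (m + 0.5) / num_chan)


def vts_jacobian(C, C_inv, mu_x, mu_n, mu_h):
    """Full Jacobian G = C diag(w) C_inv, with w = 1 / (1 + exp(C_inv (mu_n - mu_x - mu_h)))."""
    w = 1.0 / (1.0 + np.exp(C_inv @ (mu_n - mu_x - mu_h)))
    return C @ (w[:, None] * C_inv)


def adapt_variance_full(G, var_x, var_n):
    """Diagonal of G Sx G^T + (I - G) Sn (I - G)^T for diagonal Sx, Sn: O(D^2) per Gaussian."""
    F = np.eye(G.shape[0]) - G
    return (G ** 2) @ var_x + (F ** 2) @ var_n


def adapt_variance_diag(G, var_x, var_n):
    """Same update keeping only diag(G): elementwise, O(D) per Gaussian."""
    g = np.diag(G)
    return g ** 2 * var_x + (1.0 - g) ** 2 * var_n


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_cep, num_chan = 13, 24          # assumed feature sizes
    C = dct_matrix(num_cep, num_chan)
    C_inv = np.linalg.pinv(C)           # pseudo-inverse stands in for C^{-1}

    mu_x = rng.normal(size=num_cep)     # clean-speech mean (synthetic)
    mu_h = rng.normal(size=num_cep)     # channel mean (synthetic)
    mu_n = rng.normal(size=num_cep)     # noise mean (synthetic)
    var_x = rng.uniform(0.5, 2.0, size=num_cep)
    var_n = rng.uniform(0.5, 2.0, size=num_cep)

    G = vts_jacobian(C, C_inv, mu_x, mu_n, mu_h)
    full = adapt_variance_full(G, var_x, var_n)
    diag = adapt_variance_diag(G, var_x, var_n)
    print("max abs difference between full and diagonal update:",
          np.max(np.abs(full - diag)))
```

Under these assumptions, the full update touches every entry of G for each Gaussian, while the diagonal version reduces to elementwise products. How close the two results are depends on how strongly the diagonal of G dominates, which is the property the paper sets out to verify on real models; the synthetic values here are not evidence for it.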