Factorized adaptation for deep neural network

Jinyu Li; Jui-Ting Huang; Yifan Gong

ICASSP | January 2014

In this paper, we propose a novel method to adapt a context-dependent deep neural network hidden Markov model (CD-DNN-HMM) with only a limited number of parameters by taking into account the underlying factors that contribute to the distorted speech signal. We derive this factorized adaptation method from two perspectives: joint factor analysis and vector Taylor series expansion. Evaluated on Aurora 4, the proposed method achieves 19.0% and 10.6% relative word error rate reductions on test sets B and D, respectively, with only 20 adaptation utterances, and still yields a decent improvement with as few as two adaptation utterances. We also show that the proposed method outperforms feature discriminative linear regression (fDLR), an existing DNN adaptation method. Its small number of parameters and short training time make it an attractive solution for low-footprint speech applications.
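
The figures quoted above are relative reductions, that is, the drop in word error rate (WER) expressed as a fraction of the baseline WER rather than as an absolute difference. A minimal sketch of that arithmetic follows. The function name and the baseline and adapted WER values are hypothetical and chosen only to illustrate the calculation; they are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Return the relative WER reduction as a percentage of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


# Hypothetical WERs (in percent), for illustration only; not taken from the paper.
baseline = 10.0
adapted = 8.1

# An absolute drop of 1.9 points from a 10.0% baseline is roughly a 19% relative reduction.
example = relative_wer_reduction(baseline, adapted)
print(f"relative WER reduction: {example:.1f}%")
```

The same absolute gain corresponds to a larger relative reduction when the baseline WER is lower, which is worth keeping in mind when comparing relative figures across test sets with different baselines.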