Discounted likelihood linear regression for rapid speaker adaptation

The widely used maximum likelihood linear regression speaker adaptation procedure suffers from overtraining when used for rapid adaptation tasks in which the amount of adaptation data is severely limited. This is a well-known difficulty associated with the expectation-maximization algorithm. We use an information geometric analysis of the expectation-maximization algorithm as an alternating minimization of a Kullback-Leibler-type divergence to identify the cause of this difficulty, and propose a more robust discounted likelihood estimation procedure. This gives rise to a discounted likelihood linear regression procedure, which is a variant of maximum likelihood linear regression suited to small adaptation sets. Our procedure is evaluated on an unsupervised rapid adaptation task defined on the Switchboard conversational telephone speech corpus, where it improves word error rate by 1.6% (absolute) with as little as five seconds of adaptation data, a situation in which maximum likelihood linear regression overtrains in the first iteration of adaptation. We compare several realizations of discounted likelihood linear regression with maximum likelihood linear regression and other simple maximum likelihood linear regression variants, and discuss issues that arise in implementing our discounted likelihood procedures.

gunawardana01__discoun_likel_linear_regres_rapid_speak_adapt.pdf
PDF file
gunawardana01__discoun_likel_linear_regres_rapid_speak_adapt.ps
PostScript file

In  Computer Speech and Language

Publisher  Elsevier

Details

Type  Article
URL  http://www.elsevier.com/wps/find/homepage.cws_home
Pages  15-38
Volume  15
Number  1