Daniel Povey and Kaisheng Yao
May 2011
Constrained Maximum Likelihood Linear Regression (CMLLR) is a
widely used speaker adaptation technique in which an affine transform
of the features is estimated for each speaker. However, when
the amount of speech data available is very small (e.g. a few seconds),
it can be difficult to get sufficiently accurate estimates of the
transform parameters. In this paper we describe a method for estimating
CMLLR transforms robustly from less data. We do this by representing the
CMLLR transform matrix as a weighted sum over basis matrices,
where the basis is constructed in such a way that the most important
variation is concentrated in the leading coefficients. Depending on
the amount of data available, we can choose to estimate a smaller or
larger number of coefficients.
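
The weighted-sum representation can be illustrated with a minimal sketch. The abstract does not give the exact formula, so several things here are assumptions made for illustration: the transform is taken to be the identity transform [I ; 0] plus a weighted sum of basis matrices, the basis matrices are toy values rather than a trained basis, and the rule in `num_coefficients` (one coefficient per fixed number of frames) is hypothetical. Estimating the coefficients themselves from speaker data is not shown.

```python
from typing import List

Matrix = List[List[float]]


def identity_affine(dim: int) -> Matrix:
    """Return the dim x (dim+1) transform [I ; 0], which leaves features unchanged."""
    return [[1.0 if r == c else 0.0 for c in range(dim + 1)] for r in range(dim)]


def num_coefficients(num_frames: int, frames_per_coeff: float, max_coeffs: int) -> int:
    """Hypothetical rule: allow one coefficient per `frames_per_coeff` frames of data."""
    return min(max_coeffs, int(num_frames / frames_per_coeff))


def build_transform(basis: List[Matrix], coeffs: List[float], dim: int) -> Matrix:
    """Assumed form: W = [I ; 0] + sum_n coeffs[n] * basis[n]."""
    w = identity_affine(dim)
    for d, b in zip(coeffs, basis):
        for r in range(dim):
            for c in range(dim + 1):
                w[r][c] += d * b[r][c]
    return w


def apply_transform(w: Matrix, x: List[float]) -> List[float]:
    """Apply the affine transform to feature vector x, extended with a trailing 1."""
    x_ext = x + [1.0]
    return [sum(row[c] * x_ext[c] for c in range(len(x_ext))) for row in w]


# Toy basis for 2-dimensional features, ordered from most to least important.
basis = [
    [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0]],  # offset on the first dimension
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # uniform scaling
]

# With little data, only the leading coefficient is used.
n = num_coefficients(num_frames=300, frames_per_coeff=200.0, max_coeffs=len(basis))
w = build_transform(basis[:n], [0.5], dim=2)
y = apply_transform(w, [1.0, 2.0])
```

With more frames available, `num_coefficients` returns a larger value and more of the basis is used, so the transform can move further from the identity.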
In ICASSP
Publisher IEEE
Type Inproceedings