Speaker recognition with region-constrained MLLR transforms

Andreas Stolcke, Arindam Mandal, and Elizabeth Shriberg


It has been shown that standard cepstral speaker recognition models
can be enhanced by region-constrained models, where features are
extracted only from certain speech regions defined by linguistic or
prosodic criteria. Such region-constrained models can capture features
that are more stable, highly idiosyncratic, or simply complementary to
the baseline system. In this paper we ask whether another major class
of speaker recognition models, those based on MLLR speaker adaptation
transforms, can also benefit from region-constrained feature
extraction. In our approach, we define regions by phonetic and prosodic
criteria derived from automatic speech recognition output, and perform
MLLR estimation using only the frames selected by these criteria. The
resulting transform features are appended to those of a
state-of-the-art MLLR speaker recognition system and jointly modeled by
SVMs. Multiple regions can be added in this fashion. We find consistent
gains over the baseline system in the SRE2010 speaker verification
task.
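The feature construction described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `estimate_transform` and `region_supervector` are hypothetical, and the MLLR step is replaced by an ordinary least-squares affine fit, which matches maximum-likelihood estimation only under the simplifying assumptions of hard frame-to-Gaussian alignments and unit covariances. The SVM modeling stage is omitted.

```python
import numpy as np

def estimate_transform(aligned_means, frames):
    """Least-squares stand-in for MLLR (an assumption, not the paper's estimator).

    Finds W of shape (d+1, d) such that [mean, 1] @ W approximates each frame.
    """
    n = aligned_means.shape[0]
    extended = np.hstack([aligned_means, np.ones((n, 1))])  # append bias column
    W, *_ = np.linalg.lstsq(extended, frames, rcond=None)
    return W

def region_supervector(aligned_means, frames, region_masks):
    """Concatenate the baseline transform with one transform per region.

    region_masks: boolean arrays of length n_frames, one per region
    (hypothetically derived from ASR phone labels or prosodic cues).
    """
    # Baseline transform, estimated from all frames.
    parts = [estimate_transform(aligned_means, frames).ravel()]
    # One extra transform per region, estimated only from the selected frames.
    for mask in region_masks:
        W_region = estimate_transform(aligned_means[mask], frames[mask])
        parts.append(W_region.ravel())
    return np.concatenate(parts)
```

Each added region lengthens the feature vector by the size of one flattened transform, so the combined vector can be passed to a single classifier alongside the baseline features.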


Publication type: Inproceedings
Published in: Proceedings of IEEE ICASSP
Publisher: IEEE SPS