Subspace Gaussian Mixture Models for Speech Recognition

This technical report describes an acoustic modeling approach based on subspace adaptation of a shared Gaussian Mixture Model (GMM). Here "adaptation" means adaptation to a particular speech state; it is not a speaker adaptation technique, although we do later introduce a speaker adaptation technique that is tied to this framework. Our model is a large shared GMM whose parameters vary in a subspace of relatively low dimension (e.g. 50). Each state is therefore described by a low-dimensional vector that controls the GMM's means and mixture weights in a manner determined by globally shared parameters. In addition, we generalize the model so that each speech state is a mixture of substates, each with its own vector. Only the mathematical details are provided here; experimental results are being published separately.
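As a rough illustration of how a single low-dimensional state vector can determine a state's means and mixture weights, the sketch below assumes a linear map for the means and a softmax for the weights. The abstract does not state this functional form, so it is an assumption here; the names `state_gmm`, `M`, `w`, and `v` are hypothetical, and covariances and substates are left out.

```python
import math

def state_gmm(v, M, w):
    """Derive one state's means and mixture weights from its subspace vector.

    v: state vector of length S (hypothetical name)
    M: list of I matrices, each D x S, shared across all states (assumed form)
    w: list of I weight-projection vectors of length S, shared across all states
    """
    # Assumption: the mean of Gaussian i is linear in the state vector, mu_i = M_i v.
    means = [
        [sum(m * x for m, x in zip(row, v)) for row in M_i]
        for M_i in M
    ]
    # Assumption: mixture weights are a softmax over the projections w_i . v.
    logits = [sum(a * x for a, x in zip(w_i, v)) for w_i in w]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    return means, weights

# Toy sizes for illustration only: I = 2 Gaussians, D = 3 features, S = 2.
M = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]],
]
w = [[0.0, 0.0], [1.0, -1.0]]
v = [0.5, 0.5]
means, weights = state_gmm(v, M, w)
```

In this sketch only `v` is specific to a state; `M` and `w` play the role of the globally shared parameters, so adding a state costs S numbers rather than a full set of means and weights.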


Publisher  Microsoft
© 2008 Microsoft Corporation. All rights reserved.

Details

Type  TechReport
Number  MSR-TR-2009-64