Subspace Gaussian Mixture Models for Speech Recognition

Daniel Povey

Abstract

This technical report contains the details of an acoustic modeling approach based on subspace adaptation of a shared Gaussian Mixture Model (GMM). This refers to adaptation to a particular speech state; it is not a speaker adaptation technique, although we do later introduce a speaker adaptation technique that is tied to this particular framework. Our model is a large shared GMM whose parameters vary in a subspace of relatively low dimension (e.g. 50); thus each state is described by a low-dimensional vector that controls the GMM's means and mixture weights in a manner determined by globally shared parameters. In addition, we generalize to having each speech state be a mixture of substates, each with a different vector. Only the mathematical details are provided here; experimental results are being published separately.
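
As a rough illustration of the structure the abstract describes, the state-dependent GMM could be written as below. This is a sketch under assumed notation, not the report's own equations: the symbols j (speech state), i (index of a Gaussian in the shared GMM), v_j (the low-dimensional state vector), and the globally shared parameters M_i, w_i and \Sigma_i are introduced here for illustration only, as are the linear map for the means, the log-linear form for the weights, and the sharing of covariances across states.

```latex
% Illustrative notation only; these symbols are assumptions, not taken from the abstract.
% j        : speech state
% i        : index of a Gaussian in the shared GMM, i = 1..I
% v_j      : low-dimensional state vector (e.g. dimension 50)
% M_i, w_i : globally shared projections for the means and mixture weights
% \Sigma_i : covariance of Gaussian i, assumed shared across all states
\begin{align}
  p(x \mid j) &= \sum_{i=1}^{I} w_{ji}\, \mathcal{N}\!\left(x;\ \mu_{ji},\ \Sigma_i\right) \\
  \mu_{ji}    &= M_i\, v_j \\
  w_{ji}      &= \frac{\exp\!\left(w_i^{\top} v_j\right)}
                      {\sum_{i'=1}^{I} \exp\!\left(w_{i'}^{\top} v_j\right)}
\end{align}
```

Under the same assumed notation, the substate generalization would give each state j several vectors v_{jm}, one per substate m, and define p(x | j) as a weighted sum of the per-substate densities.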

Details

Publication type: TechReport
Number: MSR-TR-2009-64
Publisher: Microsoft