Daniel Povey, Martin Karafiát, Arnab Ghoshal, and Petr Schwarz
Last year we introduced the Subspace Gaussian Mixture Model (SGMM), and we demonstrated Word Error Rate improvements on a fairly small-scale task. Here we describe an extension to the SGMM, which we call the symmetric SGMM. It makes the model fully symmetric between the “speech-state vectors” and “speaker vectors” by making the mixture weights depend on the speaker as well as the speech state. We had previously avoided this as it introduces difficulties for efficient likelihood evaluation and parameter estimation, but we have found a way to overcome those difficulties. We find that the symmetric SGMM can give a very worthwhile improvement over the previously described model. We will also describe some larger-scale experiments with the SGMM, and report on progress toward releasing open-source software that supports SGMMs.
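The core idea above — letting the mixture weights depend on the speaker vector as well as the speech-state vector — can be illustrated with a minimal numerical sketch. The following is an assumption-laden illustration, not the paper's implementation: it assumes log-linear weights obtained by a softmax over inner products, with the speaker term added to the logits for the symmetric case; all variable names (`W`, `U`, `v_j`, `v_s`) are hypothetical.

```python
import numpy as np

# Hedged sketch: in the baseline SGMM, the weights for state j depend only
# on the speech-state vector v_j; the symmetric extension adds a speaker
# term so the weights also depend on the speaker vector v_s.

rng = np.random.default_rng(0)
I, S = 4, 3                    # number of shared Gaussians, subspace dimension
W = rng.normal(size=(I, S))    # weight-projection vectors for the speech state
U = rng.normal(size=(I, S))    # weight-projection vectors for the speaker
v_j = rng.normal(size=S)       # speech-state vector for state j
v_s = rng.normal(size=S)       # speaker vector for speaker s

def sgmm_weights(v_state, v_speaker=None):
    """Softmax mixture weights; the speaker term is the symmetric extension."""
    logits = W @ v_state
    if v_speaker is not None:
        logits = logits + U @ v_speaker
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()

w_plain = sgmm_weights(v_j)        # baseline SGMM weights for state j
w_sym = sgmm_weights(v_j, v_s)     # symmetric SGMM weights for state j, speaker s
```

Both weight vectors sum to one, but the symmetric version shifts probability mass per speaker, which is what makes likelihood evaluation and estimation harder than in the baseline model.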