Yu Tsao, Jinyu Li, and Chin-Hui Lee
We study separation between models of speech attributes. A good measure of separation usually serves as a key indicator of the discrimination power of these speech models because it can often be used to indirectly determine the performance of speech recognition and verification systems. In this study, we use a probabilistic distance, called generalized log likelihood ratio (GLLR), to measure the separation between a model of a target speech attribute and models of its competing attributes. We illustrate five applications that compare separations among models obtained over multiple levels of discrimination capabilities, at various degrees of acoustic definitions and resolutions, under mismatched training and testing conditions, and with different training criteria and speech parameters. We demonstrate that the well-known GLLR distance and its corresponding histograms also provide a useful tool for qualitatively and quantitatively characterizing the properties of trained models without performing large-scale speech recognition and verification experiments.
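The abstract does not spell out the GLLR formula, so the following is a minimal sketch of one common form of such a score: the target model's frame log-likelihood minus the log of the average likelihood over the competing models. All model parameters, the use of 1-D Gaussians, and the data below are illustrative assumptions, not the paper's actual setup; a per-frame score like this is what one would histogram to visualize separation.

```python
import math
import random

def gaussian_logpdf(x, mean, var):
    # Log density of a 1-D Gaussian (stand-in for a real acoustic model).
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def gllr_scores(frames, target, competitors):
    # Assumed GLLR form: log p(x | target) minus the log of the
    # average likelihood over the competing models, per frame.
    scores = []
    for x in frames:
        lt = gaussian_logpdf(x, *target)
        lc = [gaussian_logpdf(x, *m) for m in competitors]
        mx = max(lc)  # log-sum-exp trick for numerical stability
        log_avg = mx + math.log(sum(math.exp(l - mx) for l in lc) / len(lc))
        scores.append(lt - log_avg)
    return scores

# Hypothetical models and synthetic "frames" for illustration only.
random.seed(0)
target_model = (0.0, 1.0)                 # (mean, variance)
competing = [(3.0, 1.0), (-3.0, 1.0)]
target_frames = [random.gauss(0.0, 1.0) for _ in range(500)]
competing_frames = [random.gauss(3.0, 1.0) for _ in range(500)]

s_t = gllr_scores(target_frames, target_model, competing)
s_c = gllr_scores(competing_frames, target_model, competing)
mean_t = sum(s_t) / len(s_t)   # expected to be positive
mean_c = sum(s_c) / len(s_c)   # expected to be negative
print(mean_t > 0, mean_c < 0)
```

Well-separated models push the two score distributions apart, so the gap between the target-frame and competitor-frame histograms serves as the separation measure, without running a full recognition experiment.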
Published in: Proc. Interspeech