A Generative Modeling Framework for Structured Hidden Speech Dynamics

Li Deng, Dong Yu, and Alex Acero

Abstract

We outline a structured speech model, as a special and perhaps extreme form of probabilistic generative modeling. The model is equipped with long-contextual-span capabilities that are missing in the HMM approach. Compact (and physically meaningful) parameterization of the model is made possible by the continuity constraint in the hidden vocal tract resonance (VTR) domain. The target-directed VTR dynamics jointly characterize coarticulation and incomplete articulation (reduction). Preliminary evaluation results are presented on the standard TIMIT phonetic recognition task, showing the best result in this task reported in the literature without using many heterogeneous classifier combinations. The pros and cons of our structured generative modeling approach, in comparison with the structured discriminative classification approach, are discussed.

Details

Publication type: Inproceedings
Published in: NIPS Workshop on Advances in Structured Learning for Text and Speech Processing
Publisher: Microsoft