Learning Statistically Characterized Resonance Targets in a Hidden Trajectory Model of Speech Coarticulation and Reduction

We report a new development of a hidden trajectory model for coarticulated, time-varying patterns of speech. The model uses bi-directional filtering of vocal tract resonance targets to jointly represent contextual variation and phonetic reduction in speech acoustics. A novel maximum-likelihood-based learning algorithm is presented that accurately estimates the distributional parameters of the resonance targets. The resulting estimates are analyzed and shown to be consistent with all the relevant acoustic-phonetic facts and intuitions. Phonetic recognition experiments demonstrate that the model with more rigorous target training outperforms the most recent earlier version of the model, producing 17.5% fewer errors in N-best rescoring.

2005-deng-eurospeech.pdf
PDF file

In  Proc. of the Interspeech Conference

Publisher  International Speech Communication Association
© 2007 ISCA. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the ISCA and/or the author.

Details

Type  Inproceedings