Speaker-adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation

Dong Yu, Li Deng, and Alex Acero

Abstract

A novel speaker-adaptive learning algorithm is developed and evaluated for a hidden trajectory model of speech coarticulation and reduction. Central to this model is the process of bi-directional (forward and backward) filtering of the vocal tract resonance (VTR) target sequence. The VTR targets are key parameters of the model that control the hidden VTR’s dynamic behavior and the subsequent acoustic properties (those of the cepstral vector sequence). We describe two techniques for training these target parameters: (1) speaker-independent training that averages out the target variability over all speakers in the training set; and (2) speaker-adaptive training that takes into account the variability in the target values among individual speakers. Adaptive learning is also applied to adjust each unknown test speaker’s target values towards their true values. All the learning algorithms make use of the results of accurate VTR tracking as developed in our earlier work. In this paper, we present details of the learning algorithms and analysis results comparing speaker-independent and speaker-adaptive learning. We also describe TIMIT phone recognition experiments and results, demonstrating consistent superiority of speaker-adaptive learning over speaker-independent learning, as measured by phonetic recognition performance.

Details

Publication type: Article
Published in: Computer Speech and Language
Pages: 72-87
Volume: 27
Publisher: Elsevier