A Lattice Search Technique for Long-contextual-span Hidden Trajectory Model of Speech

We have recently developed a long-contextual-span hidden trajectory model (HTM) that captures the underlying dynamic structure of speech coarticulation and reduction. Because of the long-span nature of the HTM and the complexity of its likelihood score computation, N-best list rescoring was the principal paradigm for evaluating the HTM for phonetic recognition in our earlier work. In this paper, we describe improved likelihood score computation in the HTM and a novel A*-based, time-asynchronous, lattice-constrained decoding algorithm for HTM evaluation. We focus on several special considerations in the decoder design, which are necessitated by the dependency of the HTM score at each given frame on the model parameters associated with a variable number of adjacent past and future phones. We present details on how the nodes and links in the lattices are expanded via a look-ahead mechanism, how the A* heuristics are estimated, and how pruning strategies are applied to speed up the search process. Experiments on the standard TIMIT phonetic recognition task show that the new search algorithm, applied to recognition lattices, improves recognition accuracy over the traditional N-best rescoring paradigm.
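
As a rough illustration of the general idea of A* search constrained to a lattice, the following minimal Python sketch finds the best-scoring path through a small phone lattice. It is not the HTM decoder described in the paper. It assumes, for simplicity, that each link carries a fixed log score independent of its neighbors, whereas the abstract notes that the HTM score depends on a variable number of adjacent past and future phones. The names `lattice`, `backward_best`, `a_star`, and `beam` are hypothetical and chosen only for this example.

```python
import heapq

def backward_best(lattice, final):
    """Best achievable log score from each node to `final`.

    Assumes nodes are integers in topological order (links go from a
    lower-numbered node to a higher-numbered one). Because link scores
    are fixed in this toy setting, the value is exact and can serve as
    the A* heuristic.
    """
    best = {final: 0.0}
    for node in sorted(lattice, reverse=True):
        if node == final:
            continue
        options = [s + best[nxt] for nxt, s in lattice[node] if nxt in best]
        if options:
            best[node] = max(options)
    return best

def a_star(lattice, start, final, beam=None):
    """Return (log_score, path) of the best start-to-final path, or None."""
    h = backward_best(lattice, final)
    if start not in h:
        return None  # final node is unreachable from start
    # heapq is a min-heap, so store the negated priority f = g + h.
    heap = [(-h[start], 0.0, start, [start])]
    while heap:
        _, g, node, path = heapq.heappop(heap)
        if node == final:
            return g, path
        for nxt, score in lattice[node]:
            if nxt not in h:
                continue  # dead end: cannot reach the final node
            g2 = g + score
            f2 = g2 + h[nxt]
            # Optional pruning: drop hypotheses far below the best estimate.
            if beam is not None and f2 < h[start] - beam:
                continue
            heapq.heappush(heap, (-f2, g2, nxt, path + [nxt]))
    return None

# Toy lattice: node -> list of (next_node, log_score) links.
lattice = {
    0: [(1, -1.0), (2, -2.5)],
    1: [(3, -4.0)],
    2: [(3, -1.0)],
    3: [],
}
result = a_star(lattice, start=0, final=3)
print(result)
```

In this toy lattice the path through node 2 wins even though its first link scores worse, because the heuristic accounts for the remaining cost to the final node. With context-dependent scores, as in the HTM, the heuristic could only be estimated rather than computed exactly.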

In: Speech Communication

Publisher: Elsevier
Copyright © 2007 Elsevier B.V. All rights reserved.

Details

Type: Article
Volume: 48
Number: 9