Evaluation of a Long-Contextual-Span Hidden Trajectory Model and Phonetic Recognizer Using A* Lattice Search

  • Dong Yu,
  • Li Deng,
  • Alex Acero

Proc. of the Interspeech Conference

Published by International Speech Communication Association

A recently developed long-contextual-span Hidden Trajectory Model (HTM) captures the underlying dynamic structure of speech coarticulation and reduction using a highly compact set of context-independent parameters. However, the long-span nature of the HTM makes it difficult to develop efficient search algorithms for its full evaluation. In this paper, we describe our initial effort to meet this challenge. The basic search algorithm is time-asynchronous A*. Given the structural complexity of the long-span HTM, special care is needed because the HTM score for each frame depends on the model parameters associated with a variable number of adjacent phones. Specifically, we present details on how the nodes and links in the lattices are expanded via look-ahead, how the A* heuristics are estimated, and what pruning strategies are applied to speed up the search. Experiments on TIMIT phonetic recognition show that our newly developed lattice search algorithm can evaluate billions of hypotheses based on long-span HTM scores. The results significantly extend our earlier work from N-best rescoring to A* search over lattices.
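The general idea of A* search over a phone lattice with context-dependent link scores can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of the paper: the lattice format, the function names (`backward_heuristic`, `astar_best_path`), the toy scores, and the simple threshold pruning are all hypothetical, and the context is reduced to a single preceding phone rather than the variable-length span the HTM requires. The heuristic is an optimistic (upper-bound) backward pass over the lattice, which keeps the search exact when pruning is disabled.

```python
import heapq
import itertools
from collections import defaultdict

NEG_INF = float("-inf")

def backward_heuristic(links, goal, optimistic):
    """h[n]: upper bound on the best log score from node n to the goal.

    Assumes src < dst for every link (time-ordered node ids), so visiting
    links by descending src is a reverse topological order.
    """
    h = defaultdict(lambda: NEG_INF)
    h[goal] = 0.0
    for src, dst, phone in sorted(links, key=lambda l: l[0], reverse=True):
        h[src] = max(h[src], optimistic(phone) + h[dst])
    return h

def astar_best_path(links, start, goal, score, optimistic, beam=float("inf")):
    """Return (log_score, phones) of the best path, or None if none survives.

    score(prev_phone, phone) is the context-dependent link score and must
    never exceed optimistic(phone). `beam` is an illustrative pruning knob:
    hypotheses whose estimate falls more than `beam` below the initial
    upper bound are dropped, which may discard the true best path.
    """
    out = defaultdict(list)
    for src, dst, phone in links:
        out[src].append((dst, phone))
    h = backward_heuristic(links, goal, optimistic)
    threshold = h[start] - beam
    tie = itertools.count()  # tie-breaker so the heap never compares phones
    heap = [(-h[start], next(tie), 0.0, start, None, ())]
    closed = set()
    while heap:
        _, _, g, node, prev, path = heapq.heappop(heap)
        if node == goal:
            return g, list(path)
        # The search state includes the context, since scores depend on it.
        if (node, prev) in closed:
            continue
        closed.add((node, prev))
        for dst, phone in out[node]:
            g2 = g + score(prev, phone)
            f2 = g2 + h[dst]
            if f2 == NEG_INF or f2 < threshold:
                continue  # dead end or pruned
            heapq.heappush(heap, (-f2, next(tie), g2, dst, phone, path + (phone,)))
    return None

# Toy lattice: two competing phones per segment, links are (src, dst, phone).
links = [(0, 1, "a"), (0, 1, "b"), (1, 2, "c"), (1, 2, "d")]
base = {"a": -1.0, "b": -1.5, "c": -2.0, "d": -2.5}
penalty = {("a", "c"): 2.0, ("a", "d"): 1.0}  # hypothetical context penalties

def optimistic(phone):
    return base[phone]

def score(prev, phone):
    return base[phone] - penalty.get((prev, phone), 0.0)

print(astar_best_path(links, 0, 2, score, optimistic))  # (-3.5, ['b', 'c'])
```

In this toy lattice the context-free scores alone would favor the path a, c, but the context penalty makes b, c the best path, which is why the search state carries the preceding phone. With a tight beam (for example `beam=0.2`) every hypothesis is pruned and the function returns `None`, illustrating the trade-off between speed and search errors.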