Approximate Inference: A Sampling Based Modeling Technique to Capture Complex Dependencies in a Language Model

In this paper, we present strategies for incorporating long-context information directly into first-pass decoding, as well as into second-pass lattice re-scoring, in speech recognition systems. Long-span language models that capture complex syntactic and/or semantic information are seldom used in the first pass of large-vocabulary continuous speech recognition systems, because they cause a prohibitive increase in the size of the sentence-hypothesis search space. Typically, n-gram language models are used in the first pass to produce N-best lists, which are then re-scored using long-span models. Such a pipeline produces biased first-pass output, resulting in sub-optimal performance during re-scoring. We show that computationally tractable variational approximations of long-span, complex language models are a better choice than the standard n-gram model, both for first-pass decoding and for lattice re-scoring.
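
The sampling idea named in the title can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy stand-in for the long-span model (the hypothetical `long_span_next`, whose next-word weights depend on the entire sentence history) and uses a bigram model with add-one smoothing as the tractable family. Fitting an n-gram model by maximum likelihood on text sampled from a long-span model P approximately minimizes KL(P || Q) over n-gram models Q. The resulting model keeps the n-gram form that a first-pass decoder expects. All names, weights, and sizes below are illustrative assumptions.

```python
import random
from collections import Counter, defaultdict

VOCAB = ["a", "b", "c"]   # toy vocabulary (assumption)
BOS, EOS = "<s>", "</s>"  # sentence boundary symbols


def long_span_next(history, rng):
    """Hypothetical long-span model: the next-word weights depend on the
    whole history (a cache-like boost for words already in the sentence)."""
    weights = [1.0 + 2.0 * history.count(w) for w in VOCAB]
    weights.append(0.5 + 0.5 * len(history))  # ending gets likelier with length
    return rng.choices(VOCAB + [EOS], weights=weights, k=1)[0]


def sample_sentence(rng, max_len=20):
    """Draw one sentence from the long-span model, word by word."""
    history = []
    while len(history) < max_len:
        word = long_span_next(history, rng)
        if word == EOS:
            break
        history.append(word)
    return history


def fit_bigram(sentences):
    """Estimate a bigram model from sampled text with add-one smoothing."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = [BOS] + sentence + [EOS]
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    symbols = VOCAB + [EOS]
    model = {}
    for prev in [BOS] + VOCAB:
        total = sum(counts[prev].values()) + len(symbols)
        model[prev] = {w: (counts[prev][w] + 1) / total for w in symbols}
    return model


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [sample_sentence(rng) for _ in range(5000)]
bigram = fit_bigram(samples)
```

In this toy setting, the fitted bigram absorbs part of the history-dependent repetition effect (for example, `bigram["a"]["a"]` ends up larger than `bigram["a"]["b"]`), while remaining an ordinary n-gram table.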


In: Elsevier Speech Communication

Publisher: Elsevier

Type: Article