Variational Approximation of Long-Span Language Models for LVCSR

Anoop Deoras, Tomas Mikolov, Stefan Kombrink, Martin Karafiat, and Sanjeev Khudanpur


Long-span language models that capture syntax and semantics are seldom used in the first pass of large vocabulary continuous speech recognition systems because the search space of sentence hypotheses is prohibitively large. Instead, an N-best list of hypotheses is created using tractable n-gram models and then rescored using the long-span models. This paper shows that computationally tractable variational approximations of the long-span models are a better choice than standard n-gram models for first-pass decoding. They not only result in a better first-pass output, but also produce a lattice with a lower oracle word error rate, and rescoring the N-best list from such lattices with the long-span models requires a smaller N to attain the same accuracy. Empirical results on the WSJ, MIT Lectures, NIST 2007 Meeting Recognition and NIST 2001 Conversational Telephone Recognition data sets are presented to support these claims.
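
The two-pass scheme described above (generate an N-best list with a tractable model, then rescore it with a long-span model) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `lm_weight` parameter, and the toy scoring function standing in for a real long-span model are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# A hypothesis is (words, acoustic log-score) as produced by a first pass.
Hypothesis = Tuple[Sequence[str], float]


def rescore_nbest(
    nbest: List[Hypothesis],
    long_span_logprob: Callable[[Sequence[str]], float],
    lm_weight: float = 1.0,
) -> List[Tuple[float, List[str]]]:
    """Re-rank an N-best list with a second-pass language model.

    Each hypothesis gets: acoustic score + lm_weight * long-span LM log-prob.
    Returns (combined score, words) pairs sorted best first.
    """
    scored = [
        (acoustic + lm_weight * long_span_logprob(words), list(words))
        for words, acoustic in nbest
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


def toy_long_span_logprob(words: Sequence[str]) -> float:
    """Hypothetical stand-in for a long-span model (assumption, not a real LM).

    Charges 1.0 per word and an extra 5.0 for each immediately repeated word.
    """
    repeats = sum(1 for a, b in zip(words, words[1:]) if a == b)
    return -1.0 * len(words) - 5.0 * repeats


# Toy N-best list: the first pass slightly prefers the hypothesis with a repeat.
nbest = [
    (["the", "the", "cat"], -10.0),
    (["the", "cat", "sat"], -10.5),
]
reranked = rescore_nbest(nbest, toy_long_span_logprob)
best_score, best_words = reranked[0]
```

In this toy case the second-pass score overturns the first-pass ranking. The abstract's point is that when the first pass itself uses a better approximation of the long-span model, the correct hypothesis tends to appear higher in the list, so a smaller N suffices.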


Publication type: Inproceedings
Publisher: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)