A Fast Re-scoring Strategy to Capture Long-Distance Dependencies

Anoop Deoras, Tomas Mikolov, and Kenneth Church

Abstract

A re-scoring strategy is proposed that makes it feasible to capture more long-distance dependencies in natural language. Two-pass strategies have become popular in a number of recognition tasks such as ASR (automatic speech recognition), MT (machine translation), and OCR (optical character recognition). The first pass typically applies a weak language model (n-grams) to a lattice, and the second pass applies a stronger language model to N-best lists. The stronger language model is intended to capture more long-distance dependencies. The proposed method uses an RNN-LM (recurrent neural network language model), which is a long-span LM, to re-score word lattices in the second pass. A hill-climbing method (iterative decoding) is proposed to search over islands of confusability in the word lattice. An evaluation based on Broadcast News shows a speedup of 20 over basic N-best re-scoring, and a word error rate reduction of 8% (relative) on a highly competitive setup.
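The hill-climbing idea can be sketched in a few lines: the lattice is treated as a sequence of "islands of confusability" (small sets of alternative word sequences), and the decoder fixes all islands but one, re-scores the alternatives for that island with the long-span LM, keeps the best, and sweeps until the hypothesis stops changing. The sketch below is illustrative only — `islands`, `lm`, and the flat list-of-alternatives representation are simplifying assumptions, not the paper's actual lattice or RNN-LM interfaces.

```python
def iterative_decode(islands, lm, max_sweeps=10):
    """Hill-climb over islands of confusability (a sketch).

    islands: list of islands; each island is a list of alternative
             word sequences (lists of words).
    lm:      stand-in scoring function mapping a word list to a score
             (higher is better); a long-span LM in the real system.
    """
    # Start from the first alternative in each island (e.g. the
    # first-pass 1-best path).
    choice = [0] * len(islands)

    def current_words():
        return [w for j, alts in enumerate(islands) for w in alts[choice[j]]]

    for _ in range(max_sweeps):
        changed = False
        for i, alts in enumerate(islands):
            old = choice[i]
            best_alt, best_score = old, None
            # Score each alternative for island i with all other
            # islands held fixed at their current choices.
            for a in range(len(alts)):
                choice[i] = a
                score = lm(current_words())
                if best_score is None or score > best_score:
                    best_alt, best_score = a, score
            choice[i] = best_alt
            if best_alt != old:
                changed = True
        if not changed:  # converged: a full sweep made no change
            break
    return current_words()
```

Because each step re-scores only one island's handful of alternatives rather than a combinatorial N-best list over the whole lattice, the number of full-sentence LM evaluations grows roughly linearly in the number of islands, which is the source of the speedup over exhaustive N-best re-scoring.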

Details

Publication type: Inproceedings
Publisher: Empirical Methods in Natural Language Processing (EMNLP)