Empirical Evaluation and Combination of Advanced Language Modeling Techniques

Tomas Mikolov, Anoop Deoras, Stefan Kombrink, Lukas Burget, and Jan Honza Cernocky

Abstract

We present results obtained with several advanced language modeling techniques, including the class-based model, cache model, maximum entropy model, structured language model, random forest language model, and several types of neural network based language models. We show results obtained after combining all these models by linear interpolation. We conclude that for both small and moderately sized tasks, the combination of models achieves new state-of-the-art results, significantly better than the performance of any individual model. The obtained perplexity reductions are over 50% against a Good-Turing trigram baseline and over 40% against a modified Kneser-Ney smoothed 5-gram.
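The model combination described above can be sketched as a simple weighted mixture: each component model assigns a probability to the next word, and the combined probability is a convex combination of these. The snippet below is a minimal illustration, not the authors' implementation; the toy probabilities and weights are made up for demonstration (in practice, interpolation weights are tuned on held-out data, e.g. with EM).

```python
import math

def interpolate(probs, weights):
    """Linearly interpolate per-model probabilities p_i(w | h).

    probs: list of probabilities, one per component model.
    weights: interpolation weights (must sum to 1).
    """
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(l * p for l, p in zip(weights, probs))

def perplexity(word_probs):
    """Perplexity of a test sequence given per-word probabilities."""
    n = len(word_probs)
    return math.exp(-sum(math.log(p) for p in word_probs) / n)

# Toy example: two models' probabilities for a 3-word test sequence
# (hypothetical numbers, chosen only to illustrate the computation).
model_a = [0.20, 0.10, 0.05]
model_b = [0.10, 0.30, 0.10]
weights = [0.6, 0.4]

mixed = [interpolate([pa, pb], weights) for pa, pb in zip(model_a, model_b)]
print(perplexity(mixed))
```

Because the mixture can place probability mass where any one component does, the interpolated model's perplexity is typically lower than that of the weaker components, which is the effect the paper measures against the n-gram baselines.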

Details

Publication type: Inproceedings
Published in: Interspeech
Publisher: ISCA