Geoffrey Zweig and Shuangyu Chang
Model M is a recently proposed class based exponential n-gram language model. In this paper, we extend it with personalization features, address the scalability issues present with large data sets, and test its effectiveness on the Bing Mobile voice-search task. We find that Model M by itself reduces both perplexity and word error rate compared with a conventional model, and that the personalization features produce a further significant improvement. The personalization features provide a very large improvement when the history contains a relevant query; thus the overall effect is gated by the number of times a user requeries a past request.
Publisher International Speech Communication Association