A Large Scale Ranker-Based System for Search Query Spelling Correction

This paper makes three significant extensions to a noisy channel speller designed for standard writ-ten text to target the challenging domain of search queries. First, the noisy channel model is sub-sumed by a more general ranker, which allows a variety of features to be easily incorporated. Se-cond, a distributed infrastructure is proposed for training and applying Web scale n-gram language models. Third, a new phrase-based error model is presented. This model places a probability dis-tribution over transformations between mul-ti-word phrases, and is estimated using large amounts of query-correction pairs derived from search logs. Experiments show that each of these extensions leads to significant improvements over the state-of-the-art baseline methods.

coling2010 speller v5 name.pdf
PDF file

In  The 23rd International Conference on Computational Linguistics

Publisher  International Conference on Computational Linguistics
Copyright COLING 2010

Details

TypeInproceedings
> Publications > A Large Scale Ranker-Based System for Search Query Spelling Correction