Jingjing Liu, Xiao Li, Alex Acero, and Ye-Yi Wang
Lexicons are important resources for semantic tagging. However, commonly used lexicons collected from entity databases suffer from multiple problems, such as ambiguity, limited coverage, and no indication of the relative importance of their elements. In this work we present a lexicon modeling technique that automatically expands the lexicon and assigns weights to its elements. For lexicon expansion, we use a generative model to extract patterns from query logs using known lexicon seeds and then discover new lexicon elements using the learned patterns. For lexicon weighting, we propose two approaches based on generative and discriminative models to learn the relative importance of lexicon elements from user click statistics. Experiments on text queries in multiple domains show that our lexicon modeling technique can significantly improve semantic tagging performance.