Building bridges for web query classification

Dou Shen, Jian-Tao Sun, Qiang Yang, and Zheng Chen

Abstract

Web query classification (QC) aims to classify Web users' queries, which are often short and ambiguous, into a set of target categories. QC has many applications, including page ranking in Web search, targeted advertisement in response to queries, and personalization. In this paper, we present a novel approach for QC that outperforms the winning solution of the ACM KDDCUP 2005 competition, whose objective was to classify 800,000 real user queries. In our approach, we first build a bridging classifier on an intermediate taxonomy in an offline mode. This classifier is then used in an online mode to map user queries to the target categories via the intermediate taxonomy. A major innovation is that, by leveraging the similarity distribution over the intermediate taxonomy, we do not need to retrain a new classifier for each new set of target categories; the bridging classifier therefore needs to be trained only once. In addition, we introduce category selection as a new method for narrowing down the part of the intermediate taxonomy that is used to classify the queries. Category selection can improve both the efficiency and the effectiveness of the online classification. By combining our algorithm with the winning solution of KDDCUP 2005, we improve precision by 9.7% and F1 by 3.8% over the best results of KDDCUP 2005.
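The bridging idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that a query already has a score for each intermediate category and that each intermediate category has a relatedness score to each target category, and it combines the two by summing their products. The function names, the category labels, the numbers, and the simple top-k pruning used in place of category selection are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple


def select_categories(query_scores: Dict[str, float], k: int) -> Dict[str, float]:
    """Keep the k highest-scoring intermediate categories.

    Assumption: plain top-k pruning stands in for category selection;
    the paper may use a different selection criterion.
    """
    ranked = sorted(query_scores.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])


def bridge(
    query_scores: Dict[str, float],
    relatedness: Dict[str, Dict[str, float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Score target categories through the intermediate taxonomy.

    score(target) = sum over selected intermediate categories of
                    score(intermediate | query) * relatedness(target | intermediate)
    """
    totals: Dict[str, float] = {}
    for inter, q_score in select_categories(query_scores, k).items():
        for target, rel in relatedness.get(inter, {}).items():
            totals[target] = totals.get(target, 0.0) + q_score * rel
    # Highest-scoring target categories first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical scores of one query against three intermediate categories.
query_scores = {"I_hardware": 0.6, "I_electronics": 0.3, "I_golf": 0.1}

# Hypothetical relatedness of each intermediate category to the target categories.
# Only this table changes when the set of target categories changes.
relatedness = {
    "I_hardware": {"T_computers": 0.8, "T_shopping": 0.2},
    "I_electronics": {"T_computers": 0.3, "T_shopping": 0.7},
    "I_golf": {"T_sports": 1.0},
}

ranking = bridge(query_scores, relatedness, k=2)
print(ranking)
```

In this sketch the query-to-intermediate scores do not depend on the target categories, which mirrors the claim that the bridging classifier is trained once; pruning to k intermediate categories reduces the work done per query.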

Details

Publication type: Inproceedings
Published in: SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
URL: http://doi.acm.org/10.1145/1148170.1148196
Pages: 131–138
ISBN: 1-59593-369-7
Address: New York, NY, USA
Publisher: ACM