Xiaojie Yuan, Zhicheng Dou, Lu Zhang, and Fang Liu
Understanding the goal behind a user's Web query has been shown to help improve search quality. This paper addresses the automatic identification of query types according to those goals. Four novel entropy-based features extracted from anchor data and click-through data are proposed, and an SVM classifier is used to identify the user goal from these features. Experimental results show that the proposed entropy-based features are more effective than those reported in previous work. By combining multiple features, the goal can be correctly identified for more than 97% of the queries studied. The paper also draws the following conclusions: first, anchor-based features are more effective than click-through-based features; second, the number of sites is more reliable than the number of links; third, click-distribution-based features are more effective than session-based ones.
Published in: WISA2008: Proceedings of the 5th Conferences of Web Information System and Application
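
The abstract does not define the four entropy-based features, so the following is only a minimal sketch of the general idea, assuming a feature is the Shannon entropy of how clicks for one query are distributed over result URLs. The function name `click_entropy` and the sample URLs are hypothetical and are not taken from the paper. Under this assumption, a low value means clicks concentrate on a single target, and a high value means clicks are spread over many results.

```python
import math
from collections import Counter


def click_entropy(clicked_urls):
    """Shannon entropy (in bits) of the click distribution for one query.

    Illustrative only: the paper's actual feature definitions are not
    given in the abstract, so this is an assumed, generic formulation.
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum p * log2(1/p) over every distinct clicked URL.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# All clicks land on one URL: the distribution is fully concentrated.
concentrated = click_entropy(["example.com/home"] * 4)

# Clicks spread evenly over four URLs: the distribution is maximally spread.
spread = click_entropy(["a.example", "b.example", "c.example", "d.example"])

print(concentrated, spread)
```

A value like this could be one input column for a classifier such as the SVM mentioned in the abstract; how the paper combines it with anchor-based features is not specified here.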