Hua Wen, Yangqiu Song, Haixun Wang, and Xiaofang Zhou
A search task represents an atomic information need of a user in Web search. Tasks consist of queries and their reformulations, and identifying tasks is important to search engines because tasks provide valuable information for determining user satisfaction with search results, predicting user search intent, and suggesting queries to the user. Traditional approaches to identifying tasks exploit either temporal or lexical features of queries. However, many query refinements are topical, which means that a query and its refinements may not be similar on the lexical level. Furthermore, multiple tasks in the same search session may interleave, which means we cannot simply order the searches by their timestamps and divide the session into multiple tasks. Thus, in order to identify tasks correctly, we need to be able to compare two queries on the semantic level. In this paper, we use a knowledge base known as Probase to infer the conceptual meanings of queries and to automatically identify the topical query refinements in tasks. Experimental results on real search log data demonstrate that Probase can indeed help estimate the topical affinity between queries, and thus enables us to merge queries that are topically related but dissimilar on the lexical level.
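To make the idea of merging lexically dissimilar but topically related queries concrete, the following is a minimal sketch and not the method evaluated in this paper. It assumes a tiny hand-written term-to-concept map (`TOY_CONCEPTS`, a hypothetical stand-in for a knowledge-base lookup, not Probase data), cosine similarity between concept vectors, and a simple single-link merge with an arbitrary threshold; the names `concept_vector`, `cosine`, and `group_into_tasks` are illustrative.

```python
from collections import Counter
from math import sqrt

# Hypothetical term -> weighted-concept map. This is toy data standing in
# for a knowledge-base lookup; it is NOT taken from Probase.
TOY_CONCEPTS = {
    "iphone": {"smartphone": 0.9, "device": 0.5},
    "galaxy": {"smartphone": 0.8, "device": 0.4},
    "paris": {"city": 0.9, "travel destination": 0.6},
    "rome": {"city": 0.9, "travel destination": 0.7},
}


def concept_vector(query):
    """Sum the concept weights of every known term in the query."""
    vec = Counter()
    for term in query.lower().split():
        for concept, weight in TOY_CONCEPTS.get(term, {}).items():
            vec[concept] += weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def group_into_tasks(queries, threshold=0.5):
    """Single-link merge of queries whose concept similarity meets the threshold.

    Every pair is compared, so interleaved tasks are grouped regardless of
    the order in which the queries were issued. The threshold is an
    assumption chosen for this toy example.
    """
    parent = list(range(len(queries)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    vectors = [concept_vector(q) for q in queries]
    for i in range(len(queries)):
        for j in range(i + 1, len(queries)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    tasks = {}
    for i, query in enumerate(queries):
        tasks.setdefault(find(i), []).append(query)
    return list(tasks.values())


# An interleaved session: two shopping queries and two travel queries.
session = ["iphone price", "paris hotels", "galaxy review", "rome flights"]
tasks = group_into_tasks(session)
```

In this toy session no two queries share a word, so a purely lexical comparison would keep all four apart, and a purely temporal split would separate the interleaved pairs. Comparing concept vectors instead groups "iphone price" with "galaxy review" and "paris hotels" with "rome flights".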