Youngho Kim, Ahmed Hassan, Ryen White, and Yi-Min Wang
Understanding the characteristics of queries where a search engine is failing is important for improving engine performance. Previous work largely relies on user-interaction features (e.g., clickthrough statistics) to identify such underperforming queries. However, relying on interaction behavior means that searchers must first become dissatisfied and then exhibit that dissatisfaction in their search behavior, by which point it may be too late to help them. In this paper, we propose a method to generate underperforming query identification rules instantly using topical and lexical attributes. The method first generates query attributes using sources such as topics, concepts (entities), and keywords in queries. Then, association rules are learned from underperforming query examples using the FP-growth algorithm and decision trees. We develop a query classification model capable of accurately estimating dissatisfaction using the generated rules, and demonstrate significant performance gains over state-of-the-art query performance prediction models.
Publisher: ACM International Conference on Web Search and Data Mining
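
To make the rule-generation idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes each query has already been mapped to a set of topical and lexical attributes, and it replaces FP-growth with exhaustive counting of small attribute combinations, which yields the same frequent itemsets on toy data but does not scale. The decision-tree component is omitted. The attribute names, thresholds, and example queries are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: (query attribute set, searcher dissatisfied?).
# None of these examples come from the paper.
QUERIES = [
    ({"topic:health", "kw:symptoms"}, True),
    ({"topic:health", "kw:symptoms", "entity:flu"}, True),
    ({"topic:health", "kw:clinic"}, False),
    ({"topic:sports", "kw:score"}, False),
    ({"topic:sports", "kw:score", "entity:nba"}, False),
    ({"topic:health", "kw:symptoms", "entity:cold"}, False),
]


def mine_rules(examples, min_support=2, min_confidence=0.6, max_size=2):
    """Return rules (attributes, support, confidence) that predict dissatisfaction.

    Brute-force stand-in for FP-growth: count every attribute combination
    of up to max_size items, then keep those that are frequent enough and
    that co-occur with dissatisfaction often enough.
    """
    total = Counter()  # how many queries contain the combination
    bad = Counter()    # how many of those queries were dissatisfying
    for attrs, dissatisfied in examples:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(attrs), size):
                total[combo] += 1
                if dissatisfied:
                    bad[combo] += 1

    rules = []
    for combo, support in total.items():
        confidence = bad[combo] / support
        if support >= min_support and confidence >= min_confidence:
            rules.append((combo, support, confidence))
    # Highest confidence first, then highest support, then lexical order.
    return sorted(rules, key=lambda r: (-r[2], -r[1], r[0]))


def predict(rules, attrs):
    """Flag a query as likely underperforming if any rule's attributes match."""
    return any(set(combo) <= attrs for combo, _, _ in rules)


if __name__ == "__main__":
    for combo, support, confidence in mine_rules(QUERIES):
        print(combo, support, round(confidence, 2))
```

Because the rules depend only on attributes of the query text, a classifier built this way can flag a query before any interaction data exists, which is the property the abstract emphasizes.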