Multiple Intents Re-Ranking
- Yossi Azar
- Iftah Gamzu
- Xiaoxin Yin
STOC'09
Published by Association for Computing Machinery, Inc.
One of the most fundamental problems in web search is how to re-rank result web pages based on user logs. Most traditional models for re-ranking assume each query has a single intent. That is, they assume all users formulating the same query have similar preferences over the result web pages. It is clear that this is not true for a large portion of queries as different users may have different preferences over the result web pages. Accordingly, a more accurate model should assume that queries have multiple intents.
In this paper, we introduce the multiple intents re-ranking problem. This problem captures scenarios in which some user makes a query, and there is no information about the user's real search intent. In such cases, one would like to re-rank the search results in a way that minimizes the effort of all users in finding their relevant web pages. More formally, the setting of this problem consists of various types of users, each of which is interested in some subset of the search results. Moreover, each user type has a non-negative profile vector. Consider some ordering of the search results. This order sets a position for each search result, and induces a position vector of the results relevant to each user type. The overhead of a user type is the dot product of its profile vector and its induced position vector. The goal is to order the search results so as to minimize the average overhead of the users.
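To make the objective concrete, the following minimal sketch evaluates the average overhead of a given ordering. It rests on assumptions the abstract does not spell out: positions are 1-indexed, the induced position vector lists the positions of a user type's relevant results in increasing order, and all user types are weighted equally. The names `overhead` and `average_overhead` are hypothetical.

```python
from typing import Hashable, List, Sequence, Set, Tuple

# A user type is a pair: (set of relevant results, non-negative profile vector).
UserType = Tuple[Set[Hashable], Sequence[float]]


def overhead(ordering: Sequence[Hashable], relevant: Set[Hashable],
             profile: Sequence[float]) -> float:
    """Dot product of a profile vector with its induced position vector."""
    # Assumption: positions are 1-indexed.
    position = {result: i + 1 for i, result in enumerate(ordering)}
    # Assumption: the induced vector is sorted by increasing position.
    induced: List[int] = sorted(position[r] for r in relevant)
    if len(induced) != len(profile):
        raise ValueError("profile length must match the number of relevant results")
    return sum(w * p for w, p in zip(profile, induced))


def average_overhead(ordering: Sequence[Hashable],
                     user_types: Sequence[UserType]) -> float:
    """Average overhead, assuming all user types are equally weighted."""
    total = sum(overhead(ordering, rel, prof) for rel, prof in user_types)
    return total / len(user_types)


if __name__ == "__main__":
    ordering = ["a", "b", "c", "d"]
    user_types = [
        ({"a", "c"}, [1, 0]),  # satisfied by the first relevant result
        ({"b", "d"}, [0, 1]),  # needs to reach the last relevant result
        ({"c"}, [1]),
    ]
    print(average_overhead(ordering, user_types))
```

In this example, the profile `[1, 0]` charges only the position of the first relevant result, while `[0, 1]` charges only the position of the last one; these correspond to the non-increasing and non-decreasing profile shapes, respectively.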
Our main result is an O(log r)-approximation algorithm for the problem, where r is the maximum number of search results that are relevant to any user type. The algorithm is based on a new technique, which we call harmonic interpolation. In addition, we consider two important special cases. The first case is when the profile vector of each user type is non-increasing. This case is a generalization of the well-known min-sum set cover problem. We extend the techniques of Feige, Lovász and Tetali (Algorithmica ’04), and present an algorithm achieving a 4-approximation. The second case is when the profile vector of each user type is non-decreasing. This case generalizes the minimum latency set cover problem, introduced by Hassin and Levin (ESA ’05). We devise an LP-based algorithm that attains a 2-approximation for it.
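For intuition about the non-increasing case, the sketch below implements the classical greedy rule for min-sum set cover, which corresponds to every user type having the profile (1, 0, ..., 0): each user type pays only for the position of its first relevant result. This is the well-known greedy that Feige, Lovász and Tetali analyzed as a 4-approximation for min-sum set cover; it is not the paper's algorithm for general non-increasing profiles, nor the harmonic interpolation technique. The function name `greedy_min_sum_order` and the tie-breaking rule (earliest result in the input order) are assumptions made for illustration.

```python
from typing import Hashable, List, Sequence, Set


def greedy_min_sum_order(results: Sequence[Hashable],
                         user_types: Sequence[Set[Hashable]]) -> List[Hashable]:
    """Order results by repeatedly picking the one that satisfies the most
    user types that have not yet seen any relevant result."""
    remaining = list(results)
    unsatisfied = [set(rel) for rel in user_types]
    order: List[Hashable] = []
    while remaining:
        # Count how many unsatisfied user types each candidate would satisfy;
        # max() keeps the earliest candidate on ties (illustrative choice).
        best = max(remaining,
                   key=lambda r: sum(1 for rel in unsatisfied if r in rel))
        order.append(best)
        remaining.remove(best)
        # Drop the user types that are now satisfied.
        unsatisfied = [rel for rel in unsatisfied if best not in rel]
    return order


if __name__ == "__main__":
    results = ["a", "b", "c"]
    user_types = [{"a", "b"}, {"b", "c"}, {"c"}]
    print(greedy_min_sum_order(results, user_types))
```

Here `b` and `c` each satisfy two user types at the first step, and the tie is broken in favor of `b`; after that, only the user type `{"c"}` remains unsatisfied, so `c` is placed second.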
Copyright © 2007 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.