Multiple Intents Re-Ranking

  • Yossi Azar
  • Iftah Gamzu
  • Xiaoxin Yin

STOC'09

Published by Association for Computing Machinery, Inc.

One of the most fundamental problems in web search is how to re-rank result web pages based on user logs. Most traditional models for re-ranking assume each query has a single intent. That is, they assume all users formulating the same query have similar preferences over the result web pages. This assumption clearly fails for a large portion of queries, since different users may have different preferences over the result web pages. Accordingly, a more accurate model should assume that queries have multiple intents.

In this paper, we introduce the multiple intents re-ranking problem. This problem captures scenarios in which a user makes a query and there is no information about the user's real search intent. In such cases, one would like to re-rank the search results in a way that minimizes the effort of all users in finding their relevant web pages. More formally, the setting of this problem consists of various types of users, each of which is interested in some subset of the search results. Moreover, each user type has a non-negative profile vector. Consider some ordering of the search results. This order sets a position for each search result, and induces a position vector of the results relevant to each user type. The overhead of a user type is the dot product of its profile vector and its induced position vector. The goal is to order the search results so as to minimize the average overhead of the users.
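The objective defined above can be made concrete with a minimal sketch. The abstract does not fix several details, so the following are assumptions made only for illustration: positions are 1-indexed, the induced position vector lists the positions of a type's relevant results in increasing order, and all user types carry equal weight in the average. The names `overhead` and `average_overhead` are hypothetical.

```python
from typing import Hashable, Sequence, Set, Tuple

# A user type is a pair (relevant results, profile vector).
UserType = Tuple[Set[Hashable], Sequence[float]]


def overhead(order: Sequence[Hashable],
             relevant: Set[Hashable],
             profile: Sequence[float]) -> float:
    """Dot product of a profile vector with the induced position vector.

    Assumptions: positions start at 1, and the position vector is the
    sorted list of positions of the relevant results.
    """
    position = {result: i + 1 for i, result in enumerate(order)}
    positions = sorted(position[r] for r in relevant)
    return sum(w * p for w, p in zip(profile, positions))


def average_overhead(order: Sequence[Hashable],
                     user_types: Sequence[UserType]) -> float:
    """Unweighted average overhead over all user types (an assumption)."""
    total = sum(overhead(order, rel, prof) for rel, prof in user_types)
    return total / len(user_types)


order = ["a", "b", "c", "d"]
user_types = [
    ({"a", "c"}, [1, 0]),  # cares only about its first relevant result
    ({"b", "d"}, [0, 1]),  # cares only about its last relevant result
    ({"c", "d"}, [1, 1]),  # pays for both relevant results
]
print(average_overhead(order, user_types))  # (1 + 4 + 7) / 3 = 4.0
```

In this toy instance the three types incur overheads 1, 4 and 7 under the order a, b, c, d, so a different order may lower the average.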

Our main result is an O(log r)-approximation algorithm for the problem, where r is the maximum number of search results that are relevant to any user type. The algorithm is based on a new technique, which we call harmonic interpolation. In addition, we consider two important special cases. The first case is when the profile vector of each user type is non-increasing. This case is a generalization of the well-known min-sum set cover problem. We extend the techniques of Feige, Lovász and Tetali (Algorithmica ’04), and present an algorithm achieving a 4-approximation. The second case is when the profile vector of each user type is non-decreasing. This case generalizes the minimum latency set cover problem, introduced by Hassin and Levin (ESA ’05). We devise an LP-based algorithm that attains a 2-approximation for it.
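To illustrate the first special case, consider plain min-sum set cover, which corresponds to every profile vector having the form (1, 0, ..., 0): a user type pays only for the position of its first relevant result. The sketch below is the classical greedy rule for that problem, the one analyzed by Feige, Lovász and Tetali. It is not the algorithm of this paper for general non-increasing profiles, and the names `greedy_order` and `first_hit_cost` are hypothetical.

```python
from typing import Hashable, List, Sequence, Set


def greedy_order(results: Sequence[Hashable],
                 user_types: Sequence[Set[Hashable]]) -> List[Hashable]:
    """Repeatedly place the result relevant to the most unserved user types."""
    remaining = list(results)
    uncovered = [set(rel) for rel in user_types]
    order: List[Hashable] = []
    while remaining:
        # Ties are broken by the original order of `results`.
        best = max(remaining,
                   key=lambda r: sum(1 for rel in uncovered if r in rel))
        order.append(best)
        remaining.remove(best)
        # User types that contain `best` are now served.
        uncovered = [rel for rel in uncovered if best not in rel]
    return order


def first_hit_cost(order: Sequence[Hashable],
                   user_types: Sequence[Set[Hashable]]) -> int:
    """Total overhead when each profile vector is (1, 0, ..., 0)."""
    position = {r: i + 1 for i, r in enumerate(order)}
    return sum(min(position[r] for r in rel) for rel in user_types)


results = ["a", "b", "c", "d"]
types = [{"a", "b"}, {"b", "c"}, {"b", "d"}, {"d"}]
ranked = greedy_order(results, types)
print(ranked, first_hit_cost(ranked, types))
print(results, first_hit_cost(results, types))
```

On this small instance the greedy order places b first, because it serves three of the four user types, and its total overhead is lower than that of the original order.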