Sayan Bhattacharya, Sreenivas Gollapudi, and Kamesh Munagala
In commerce search, the set of products returned by a search engine often forms the basis for all user interactions leading up to a potential transaction on the web. Such a set of products is known as the consideration set. In this study, we consider the problem of generating consideration set of products in commerce search so as to maximize user satisfaction. One of the key features of commerce search that we exploit in our study is the association of a set of important attributes with the products and a set of specified attributes with the user queries. The attribute space admits a natural definition of user satisfaction via user preferences on the attributes and their values, viz. require that the surfaced products be close to the specified attribute values in the query, and diverse with respect to the unspecified attributes. We model this as a general Max-Sum Dispersion problem wherein we are given a set of n nodes in a metric space and the objective is to select a subset of nodes with total cost at most a given budget, and maximize the sum of the pairwise distances between the selected nodes. In our setting, each node denotes a product, the cost of a node being inversely proportional to its relevance with respect to specified attributes. The distance between two nodes quantifies the diversity with respect to the unspecified attributes. The problem is NP-hard and a 2-approximation was previously known only when all the nodes have unit cost.
In our setting, we do not make any assumptions on the cost. We label this problem as the General Max-Sum Dispersion problem. We give the first constant factor approximation algorithm for this problem, achieving an approximation ratio of 2. Further, we perform extensive empirical analysis on real-world data to show the effectiveness of our algorithm.
In Proc. International Conference on World Wide Web (WWW)
Publisher International World Wide Web Conference