Document Selection for Tiered Indexing in Commerce Search

  • Debmalya Panigrahi ,
  • Sreenivas Gollapudi

Proc. of Web Search and Data Mining |

Published by ACM International Conference on Web Search And Data Mining

A search engine aims to return a set of relevant documents in response to a query, while minimizing the response time. This has led to the use of a tiered index, where the search engine maintains a small cache of documents that can serve a large fraction of queries. We give a novel algorithm for the selection of documents in a tiered index for commerce search (i.e. users searching for products on the web) that effectively exploits the superior structural characteristics of commerce search queries. This is in sharp contrast to previous approaches to tiered indexing that were aimed at general web search where queries are typically unstructured. We theoretically analyze our algorithms and give performance guarantees even in worst-case scenarios. We then complement and strengthen our theoretical claims by performing exhaustive experiments on real- world commerce search data, and show that our algorithm outperforms state-of-the-art tiered indexing techniques that were developed for general web search.