Kaushik Chakrabarti, Surajit Chaudhuri, and Venkatesh Ganti
April 2011
Optimizing execution of top-k queries over recordid
ordered, compressed lists is challenging. The threshold family
of algorithms cannot be effectively used in such cases. Yet,
improving execution of such queries is of great value. For
example, top-k keyword search in information retrieval (IR)
engines represents an important scenario where such optimization
can be directly beneficial. In this paper, we develop novel
algorithms to improve execution of such queries over state
of the art techniques. Our main insights are pruning based
on fine-granularity bounds and traversing the lists based on
judiciously chosen “intervals” rather than individual records.
We formally study the optimality characteristics of the proposed
algorithms. Our algorithms require minimal changes and can be
easily integrated into IR engines. Our experiments on real-life
datasets show that our algorithm outperform the state of the art
techniques by a factor of 3-6 in terms of query execution times.
![]() PDF file |
In ICDE Conference
Publisher IEEE
| Type | Inproceedings |