Interval-Based Pruning for Top-k Processing over Compressed Lists

Optimizing execution of top-k queries over recordid

ordered, compressed lists is challenging. The threshold family

of algorithms cannot be effectively used in such cases. Yet,

improving execution of such queries is of great value. For

example, top-k keyword search in information retrieval (IR)

engines represents an important scenario where such optimization

can be directly beneficial. In this paper, we develop novel

algorithms to improve execution of such queries over state

of the art techniques. Our main insights are pruning based

on fine-granularity bounds and traversing the lists based on

judiciously chosen “intervals” rather than individual records.

We formally study the optimality characteristics of the proposed

algorithms. Our algorithms require minimal changes and can be

easily integrated into IR engines. Our experiments on real-life

datasets show that our algorithm outperform the state of the art

techniques by a factor of 3-6 in terms of query execution times.

PDF file

In  ICDE Conference

Publisher  IEEE


> Publications > Interval-Based Pruning for Top-k Processing over Compressed Lists