Kaushik Chakrabarti, Venkatesh Ganti, Jiawei Han, and Dong Xin
In many document collections, documents are related to objects
such as document authors, products described in the document, or
persons referred to in the document. In many applications, the goal
is to find such related objects that best match a set of keywords.
The keywords need not occur in the textual descriptions
of target objects; they may occur only in the documents. To
answer these queries, we exploit the relationships between the
documents containing the keywords and the target objects related
to those documents. Current keyword query paradigms do not use
these relationships effectively and hence are inefficient for these queries.
In this paper, we consider a class of queries called the
“object finder” queries. Our goal is to return the top K objects that
best match a given set of keywords by exploiting the relationships
between documents and objects. We design efficient algorithms by
developing early termination strategies in the presence of blocking
operators such as group by. Our experiments with real datasets
and workloads demonstrate the effectiveness of our techniques.
Although we present our techniques in the context of keyword
search, our techniques apply to other types of ranked searches
(e.g., multimedia search) as well.
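
To make the query class concrete, the following is a minimal sketch of a naive baseline for an "object finder" query, not the early termination algorithms this paper develops. It scores every document against the keywords, aggregates those scores per related object (the blocking "group by" step), and returns the top K objects. The function name `object_finder`, the occurrence-count scoring, the sum aggregation, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict
import heapq


def object_finder(keywords, documents, doc_objects, k):
    """Naive baseline: score all documents, group by related object, keep top k."""
    terms = [w.lower() for w in keywords]
    object_scores = defaultdict(float)
    for doc_id, text in documents.items():
        tokens = text.lower().split()
        # Assumed document score: total occurrences of the query keywords.
        doc_score = sum(tokens.count(t) for t in terms)
        if doc_score == 0:
            continue
        # "Group by" step: add the document's score to each related object.
        for obj in doc_objects.get(doc_id, ()):
            object_scores[obj] += doc_score
    # Highest aggregate score first; ties are broken by object name.
    return heapq.nsmallest(
        k, object_scores.items(), key=lambda kv: (-kv[1], kv[0])
    )


# Hypothetical data: documents are papers, target objects are their authors.
documents = {
    "d1": "top k query processing with early termination",
    "d2": "keyword search over relational databases",
    "d3": "early termination for ranked keyword search",
}
doc_objects = {"d1": ["alice"], "d2": ["bob"], "d3": ["alice", "bob"]}

print(object_finder(["keyword", "search"], documents, doc_objects, 2))
```

Note that the keywords never appear in the object names themselves; the objects are reached only through the documents that contain the keywords. This baseline must scan every matching document before it can rank any object, which is the cost that early termination strategies aim to avoid.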
Published in: SIGMOD Conference