Similarity Search using Concept Graphs

Rakesh Agrawal, Sreenivas Gollapudi, Anitha Kannan, and Krishnaram Kenthapadi

Abstract

The rapid proliferation of hand-held devices has led to the development of rich, interactive and immersive applications, such as e-readers for electronic books. These applications motivate retrieval systems that can implicitly satisfy any information need of the reader by exploiting the context of the user’s interactions. Such retrieval systems differ from traditional search engines in that the queries constructed using the context are typically complex objects (including the document and its structure).

In this paper, we develop an efficient retrieval system, assuming only oracle access to a traditional search engine that admits ‘succinct’ keyword queries for retrieving objects of a desired media type. As part of query generation, we first map the complex query object to a concept graph and then use the concepts, along with their relationships in the graph, to compute a small set of keyword queries to the search engine. Next, as part of result generation, we aggregate the results of these queries to identify relevant web content of the desired type, thereby eliminating the need to explicitly compute similarity between the query object and all web content. We present a theoretical analysis of our approach and carry out a detailed empirical evaluation to show its practicality for the task of augmenting electronic documents with high-quality videos from the web.
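
The two stages described above (query generation from a concept graph, then aggregation of search results) can be illustrated with a minimal sketch. It is not the paper's algorithm: the weighted-edge graph representation, the rule of turning the heaviest edges into queries, the reciprocal-rank voting, and every name (`generate_queries`, `aggregate`, `toy_search`, `GRAPH`, the video identifiers) are assumptions made for illustration, and the search engine is replaced by a hard-coded stub.

```python
from collections import Counter

# Hypothetical concept graph: each edge links two related concepts and
# carries an assumed relatedness weight.
GRAPH = {
    ("photosynthesis", "chlorophyll"): 0.9,
    ("photosynthesis", "sunlight"): 0.7,
    ("chlorophyll", "leaf"): 0.4,
    ("leaf", "tree"): 0.1,
}

# Stub standing in for oracle access to a keyword search engine.
_TOY_INDEX = {
    "photosynthesis chlorophyll": ["video_a", "video_b"],
    "photosynthesis sunlight": ["video_b", "video_c"],
    "chlorophyll leaf": ["video_b", "video_d"],
}


def toy_search(query):
    """Return a ranked result list for a keyword query (empty if unknown)."""
    return _TOY_INDEX.get(query, [])


def generate_queries(concept_graph, max_queries=3):
    """Turn the heaviest edges of the graph into short keyword queries."""
    # Sort by descending weight; break ties on the concept pair for determinism.
    edges = sorted(concept_graph.items(), key=lambda kv: (-kv[1], kv[0]))
    return [" ".join(pair) for pair, _ in edges[:max_queries]]


def aggregate(queries, search, top_k=2):
    """Merge per-query result lists using reciprocal-rank voting."""
    votes = Counter()
    for query in queries:
        for rank, item in enumerate(search(query)):
            votes[item] += 1.0 / (rank + 1)
    # Highest score first; ties broken by identifier.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_k]]


queries = generate_queries(GRAPH)
results = aggregate(queries, toy_search)
```

In this toy setup, content returned by several of the generated queries rises to the top without any direct comparison between the query document and individual videos, which is the property the abstract highlights.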

Details

Publication type: Inproceedings
Published in: International Conference on Information and Knowledge Management (CIKM)
Publisher: ACM – Association for Computing Machinery