Wikipedia Pages as Entry Points for Book Search

Marijn Koolen, Gabriella Kazai, and Nick Craswell

Abstract

A lot of the world's knowledge is stored in books, which, as a result of recent mass-digitisation efforts, are increasingly available online. Search engines, such as Google Books, provide mechanisms for searchers to enter this vast knowledge space using queries as entry points. In this paper, we view Wikipedia as a summary of this world knowledge and aim to use this resource to guide users to relevant books. Thus, we investigate possible ways of using Wikipedia as an intermediary between the user's query and a collection of books being searched. We experiment with traditional query expansion techniques, exploiting Wikipedia articles as rich sources of information that can augment the user's query. We then propose a novel approach based on link distance in an extended Wikipedia graph: we associate books with Wikipedia pages that cite these books and use the link distance between these nodes and the pages that match the user query as an estimation of a book's relevance to the query. Our results show that a) classical query expansion using terms extracted from query pages leads to increased precision, and b) link distance between query and book pages in Wikipedia provides a good indicator of relevance that can boost the retrieval score of relevant books in the result ranking of a book search engine.
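The link-distance idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the adjacency-dict graph format, the function names `link_distances` and `rerank`, the `weight` parameter, and the boost formula `weight / (1 + distance)` are all assumptions made for illustration. The abstract only states that link distance between query pages and book-citing pages is used as an estimate of relevance that can boost retrieval scores.

```python
from collections import deque


def link_distances(graph, query_pages):
    """Multi-source breadth-first search over a page link graph.

    graph: dict mapping a page title to an iterable of linked page titles
           (assumed format, not taken from the paper).
    Returns the shortest link distance from any query page to each
    reachable page.
    """
    dist = {page: 0 for page in query_pages}
    queue = deque(dist)
    while queue:
        page = queue.popleft()
        for neighbour in graph.get(page, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[page] + 1
                queue.append(neighbour)
    return dist


def rerank(base_scores, citing_pages, graph, query_pages, weight=1.0):
    """Boost each book's retrieval score by its closeness to the query pages.

    base_scores:  dict book -> score from the underlying book search engine.
    citing_pages: dict book -> Wikipedia pages that cite the book.
    The boost weight / (1 + d) is a hypothetical choice; books with no
    reachable citing page keep their base score.
    """
    dist = link_distances(graph, query_pages)
    boosted = {}
    for book, score in base_scores.items():
        reachable = [dist[p] for p in citing_pages.get(book, ()) if p in dist]
        boost = weight / (1 + min(reachable)) if reachable else 0.0
        boosted[book] = score + boost
    return sorted(boosted, key=boosted.get, reverse=True)


# Toy data, entirely made up for illustration.
toy_graph = {"A": ["B"], "B": ["C"], "C": []}
toy_scores = {"book1": 1.0, "book2": 1.0, "book3": 1.0}
toy_citations = {"book1": ["C"], "book2": ["A"]}
ranking = rerank(toy_scores, toy_citations, toy_graph, query_pages=["A"])
```

In this toy setup, a book cited directly on a query page receives the largest boost, a book cited two links away receives a smaller one, and a book with no citing page is left at its original score.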

Details

Publication type: Inproceedings
Published in: Proceedings of the Second ACM International Conference on Web Search and Data Mining (WSDM'09)
Publisher: Association for Computing Machinery, Inc.