Book Search Experiments: Investigating IR Methods for the Indexing and Retrieval of Books

Through mass-digitization projects and with the use of OCR technologies, digitized books are becoming available on the Web and in digital libraries. The unprecedented scale of these efforts, the unique characteristics of the digitized material as well as the unexplored possibilities of user interactions make full-text book search an exciting area of information retrieval (IR) research. Emerging research questions include: How appropriate and effective are traditional IR models when applied to books? What book specific features (e.g., back-of-book index) should receive special attention during the indexing and retrieval processes? How can we tackle scalability? In order to answer such questions, we developed an experimental platform to facilitate rapid prototyping of a book search system as well as to support large-scale tests. Using this system, we performed experiments on a collection of 10 000 books, evaluating the efficiency of a novel multi-field inverted index and the effectiveness of the BM25F retrieval model adapted to books, using book-specific fields.

In: Advances in Information Retrieval, 30th European Conference on IR Research, ECIR 2008

Publisher: Springer

Details

Type: Inproceedings
Pages: 234-245
Volume: 4956
Series: Lecture Notes in Computer Science
ISBN: 978-3-540-78645-0