The Scalable Hyperlink Store

Marc Najork

Abstract

This paper describes the Scalable Hyperlink Store (SHS), a distributed in-memory "database" for storing large portions of the web graph. SHS is an enabler for research on structural properties of the web graph as well as new link-based ranking algorithms. Previous work on specialized hyperlink databases focused on finding efficient compression algorithms for web graphs. By contrast, this work focuses on the systems issues of building such a database. Specifically, it describes how to build a hyperlink database that is fast, scalable, fault-tolerant, and incrementally updateable.

Details

Publication type: Inproceedings
Published in: 20th ACM Conference on Hypertext and Hypermedia
Publisher: Association for Computing Machinery, Inc.