Share on Facebook Tweet on Twitter Share on LinkedIn Share by email
Less is More: Sampling the Neighborhood Graph Makes SALSA Better and Faster

Marc Najork, Sreenivas Gollapudi, and Rina Panigrahy

Abstract

In this paper, we attempt to improve the effectiveness and the efficiency of query-dependent link-based ranking algorithms such as HITS, MAX and SALSA. All these ranking algorithms view the results of a query as nodes in the web graph, expand the result set to include neighboring nodes, and compute scores on the induced neighborhood graph. In previous work it was shown that SALSA in particular is substantially more effective than query-independent link-based ranking algorithms such as PageRank. In this work, we show that whittling down the neighborhood graph through consistent sampling of nodes and edges makes SALSA and its cousins both faster (more efficient) and better (more effective). We offer a hypothesis as to why “less is more”, i.e. why using a reduced graph improves performance.

Details

Publication typeInproceedings
Published in2nd ACM International Conference on Web Search and Data Mining (WSDM)
PublisherAssociation for Computing Machinery, Inc.
> Publications > Less is More: Sampling the Neighborhood Graph Makes SALSA Better and Faster