Block-Level Link Analysis

Deng Cai, Xiaofei He, Ji-Rong Wen, and Wei-Ying Ma

Abstract

Link analysis has shown great potential for improving the performance of web search, and PageRank and HITS are two of the most popular algorithms. Most existing link analysis algorithms treat a web page as a single node in the web graph. However, in most cases a web page contains multiple semantic topics, so it should not be treated as an atomic node. In this paper, the web page is partitioned into blocks using the vision-based page segmentation algorithm. By extracting the page-to-block and block-to-page relationships from link structure and page layout analysis, we can construct a semantic graph over the WWW in which each node represents exactly one semantic topic. This graph can better describe the semantic structure of the web. Based on block-level link analysis, we propose two new algorithms, Block Level PageRank and Block Level HITS, whose performance we study extensively using web data.
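
As a rough illustration of how page-to-block and block-to-page relationships could be combined into a graph, the sketch below multiplies two small matrices and runs a standard PageRank power iteration on the result. The abstract does not say how the relationships are combined, so the matrix product, the toy data, the damping factor, and every name here (`page_to_block`, `block_to_page`, `matmul`, `row_normalize`, `pagerank`) are assumptions made for illustration, not the authors' implementation.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def row_normalize(m):
    """Scale each row to sum to 1; all-zero rows become uniform."""
    n = len(m[0])
    out = []
    for row in m:
        s = sum(row)
        out.append([v / s for v in row] if s > 0 else [1.0 / n] * n)
    return out


def pagerank(weights, damping=0.85, iters=100):
    """Standard PageRank power iteration on a row-normalized weight matrix."""
    p = row_normalize(weights)
    n = len(p)
    rank = [1.0 / n] * n
    for _ in range(iters):
        rank = [(1 - damping) / n
                + damping * sum(rank[i] * p[i][j] for i in range(n))
                for j in range(n)]
    return rank


# Hypothetical toy data: 3 pages, 4 blocks.
# page_to_block[p][b]: assumed importance of block b within page p.
page_to_block = [
    [0.7, 0.3, 0.0, 0.0],  # page 0 contains blocks 0 and 1
    [0.0, 0.0, 1.0, 0.0],  # page 1 contains block 2
    [0.0, 0.0, 0.0, 1.0],  # page 2 contains block 3
]
# block_to_page[b][p]: assumed share of block b's links that point to page p.
block_to_page = [
    [0.0, 1.0, 0.0],  # block 0 links to page 1
    [0.0, 0.0, 1.0],  # block 1 links to page 2
    [1.0, 0.0, 0.0],  # block 2 links to page 0
    [1.0, 0.0, 0.0],  # block 3 links to page 0
]

# Assumption: chaining the two relationships gives weighted graphs.
page_graph = matmul(page_to_block, block_to_page)   # pages x pages
block_graph = matmul(block_to_page, page_to_block)  # blocks x blocks

scores = pagerank(page_graph)
```

In this toy setup, page 0 splits its outgoing weight between pages 1 and 2 according to the assumed importance of the block that carries each link, so a link in a prominent block counts for more than a link in a minor one.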

Details

Publication type: Inproceedings
URL: http://www.acm.org/
Pages: 8
Number: MSR-TR-2004-50
Institution: Microsoft Research
Publisher: Association for Computing Machinery, Inc.