Organizing WWW Images Based on the Analysis of Page Layout and Web Link Structure
- Deng Cai
- Xiaofei He
- Wei-Ying Ma
- Ji-Rong Wen
- Hong-Jiang Zhang
Published by Institute of Electrical and Electronics Engineers, Inc.
Due to the rapid growth of the number of digital images on the Web, there is an increasing demand for effective and efficient methods for organizing and retrieving the available images. This paper describes a method for clustering and embedding WWW images. By using a vision-based page segmentation algorithm, a web page is partitioned into blocks, and the textual and link information of an image can be accurately extracted from the block containing that image. By extracting the page-to-block, block-to-image, and block-to-page relationships through link structure and page layout analysis, we construct an image graph. With the image graph model, we use techniques from spectral graph theory for image clustering and embedding. Some experimental results are given in the paper.
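The graph construction summarized in the abstract can be illustrated with a small sketch. This is not the paper's exact formulation: the matrix names Z, X, and Y, the way they are multiplied together, the symmetrization step, and the toy weights are all assumptions made for illustration, and the spectral step is a standard normalized-Laplacian embedding rather than a reproduction of the authors' method.

```python
import numpy as np

def image_graph(Z, X, Y):
    """Combine the three relationships into an image-to-image weight matrix.

    Z: page-to-block  (pages x blocks), Z[p, b] > 0 if block b lies in page p
    X: block-to-page  (blocks x pages), X[b, p] > 0 if block b links to page p
    Y: block-to-image (blocks x images), Y[b, i] > 0 if image i lies in block b
    The products below are one assumed way to chain these relationships.
    """
    W_block = X @ Z                  # block -> linked page -> blocks of that page
    W_image = Y.T @ W_block @ Y      # image -> its block -> other block -> image
    return (W_image + W_image.T) / 2.0   # symmetrize for spectral analysis

def spectral_embedding(W, dim=1):
    """Embed nodes using the normalized graph Laplacian.

    Assumes every node has positive degree (no isolated images).
    """
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    L_sym = np.eye(len(degree)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    # Skip the trivial first eigenvector; rescale to the random-walk form.
    return d_inv_sqrt[:, None] * eigvecs[:, 1:dim + 1]

# Toy data (hypothetical): 4 pages, one block per page, 5 images.
Z = np.eye(4)
X = np.array([
    [0.0, 1.0, 0.0, 0.0],   # block 0 links to page 1
    [0.9, 0.0, 0.1, 0.0],   # block 1 links mostly to page 0, weakly to page 2
    [0.0, 0.0, 0.0, 1.0],   # block 2 links to page 3
    [0.0, 0.0, 1.0, 0.0],   # block 3 links to page 2
])
Y = np.array([
    [1, 1, 0, 0, 0],        # block 0 holds images 0 and 1
    [0, 0, 1, 0, 0],        # block 1 holds image 2
    [0, 0, 0, 1, 0],        # block 2 holds image 3
    [0, 0, 0, 0, 1],        # block 3 holds image 4
], dtype=float)

W = image_graph(Z, X, Y)
embedding = spectral_embedding(W, dim=1)
labels = embedding[:, 0] > 0    # two-way split by the sign of the Fiedler vector
print(labels)
```

In this toy graph the only connection between images 0 to 2 and images 3 and 4 is the weak link from block 1 to page 2, so the sign split falls across that link. Which group receives `True` depends on the arbitrary sign of the eigenvector.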
© 2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.