Zhong Wu, Qifa Ke, Michael Isard, and Jian Sun
June 2009
In state-of-the-art image retrieval systems, an image is
represented by a bag of visual words obtained by quantizing
high-dimensional local image descriptors, and scalable
schemes inspired by text retrieval are then applied for large-scale
image indexing and retrieval. Bag-of-words representations,
however: 1) reduce the discriminative power of
image features due to feature quantization; and 2) ignore
geometric relationships among visual words. Exploiting
such geometric constraints, by estimating a 2D affine transformation
between a query image and each candidate image,
has been shown to greatly improve retrieval precision
but at high computational cost. In this paper we present
a novel scheme where image features are bundled into local
groups. Each group of bundled features becomes much
more discriminative than a single feature, and within each
group simple and robust geometric constraints can be efficiently
enforced. Experiments in web image search, with a
database of more than one million images, show that our
scheme achieves a 49% improvement in average precision
over the baseline bag-of-words approach. Retrieval performance
is comparable to existing full geometric verification
approaches while being much less computationally expensive.
When combined with full geometric verification
we achieve a 77% precision improvement over the baseline
bag-of-words approach, and a 24% improvement over full
geometric verification alone.
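
The two ideas contrasted in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not say how bundles are formed or scored, so the function names, the `(visual_word, x_position)` bundle layout, and the scoring rule (shared visual words minus adjacent order violations) are all assumptions chosen for illustration. The first two functions show plain bag-of-words quantization, which discards feature positions. The third shows how a group of features can be matched as a unit with a cheap ordering check instead of a full affine estimate.

```python
from collections import Counter
from math import dist


def quantize(descriptor, vocabulary):
    """Map a local descriptor to the index of its nearest visual word."""
    return min(range(len(vocabulary)), key=lambda i: dist(descriptor, vocabulary[i]))


def bag_of_words(descriptors, vocabulary):
    """Represent an image as visual-word counts; feature positions are discarded."""
    return Counter(quantize(d, vocabulary) for d in descriptors)


def bundle_score(query_bundle, candidate_bundle):
    """Score two bundles, each a list of (visual_word, x_position) pairs.

    Hypothetical rule (an assumption, not taken from the paper):
    number of shared visual words minus the number of adjacent pairs
    whose left-to-right order differs between the two bundles.
    Assumes each visual word occurs at most once per bundle.
    """
    query_x = dict(query_bundle)
    candidate_x = dict(candidate_bundle)
    # Shared words, listed in the query's left-to-right order.
    shared = sorted(query_x.keys() & candidate_x.keys(), key=lambda w: query_x[w])
    # Positions of the same words in the candidate, in that order.
    order = [candidate_x[w] for w in shared]
    violations = sum(1 for a, b in zip(order, order[1:]) if a > b)
    return len(shared) - violations


if __name__ == "__main__":
    vocabulary = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]  # toy 2-D "descriptors"
    print(bag_of_words([(1.0, 1.0), (9.0, 9.0), (0.5, 0.2)], vocabulary))

    query = [(0, 1.0), (1, 2.0), (2, 3.0)]
    print(bundle_score(query, [(0, 5.0), (1, 6.0), (2, 7.0)]))  # same order
    print(bundle_score(query, [(0, 7.0), (1, 6.0), (2, 5.0)]))  # reversed order
```

In this toy setup, a bundle whose shared words keep their relative order scores higher than one containing the same words in reversed order, which a plain bag-of-words histogram cannot distinguish.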
In: CVPR 2009
Publisher: IEEE
© 2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
http://www.ieee.org/
Type: Inproceedings