Scalable Near Identical Image and Shot Detection
Ondřej Chum, James Philbin, Michael Isard and Andrew Zisserman
Proceedings of the International Conference on Image and Video
Retrieval (CIVR) 2007
Abstract
This paper proposes and compares two novel schemes for near-duplicate
image and video-shot detection. The first approach is based on global
hierarchical colour histograms, using Locality Sensitive Hashing for
fast retrieval. The second approach uses local feature descriptors
(SIFT) and, for retrieval, applies a min-Hash algorithm, a technique
from the information retrieval community for computing approximate set
intersections between documents. The requirements for
near-duplicate images vary according to the application, and we
address two types of near-duplicate definition: (i) being perceptually
identical (e.g. up to noise, discretization effects, small photometric
distortions, etc.); and (ii) being images of the same 3D scene (so
allowing for viewpoint changes and partial occlusion). We define two
shots to be near-duplicates if they share a large percentage of
near-duplicate frames. We focus primarily on scalability to very large
image and video databases, where fast query processing is
necessary. Both methods are designed so that only a small amount of
data need be stored for each image. In the case of near-duplicate shot
detection it is shown that a weak approximation to histogram matching,
consuming substantially less storage, is sufficient for good
results. We demonstrate our methods on the TRECVID 2006 data set, which
contains approximately 165 hours of video (about 17.8M frames with
146K key frames), and also on feature films and pop videos.
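
To illustrate the min-Hash idea mentioned in the abstract, the following is a minimal sketch of how approximate set overlap can be estimated from short signatures. It is not the authors' implementation: it assumes each image has already been reduced to a set of integer visual-word ids (for example, quantized SIFT descriptors), and it substitutes simple linear hash functions for true random permutations. The function names, the choice of prime, and the number of hash functions are all illustrative assumptions.

```python
import random

# Assumption: a Mersenne prime larger than any visual-word id we expect.
PRIME = 2_147_483_647  # 2**31 - 1


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(w) = (a*w + b) mod PRIME.

    Linear hashes are a common stand-in for random permutations; they are
    only approximately min-wise independent.
    """
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(word_ids, params):
    """Return one minimum hash value per hash function for a non-empty set."""
    return [min((a * w + b) % PRIME for w in word_ids) for a, b in params]


def estimate_similarity(sig_a, sig_b):
    """Fraction of agreeing signature entries; estimates |A & B| / |A | B|."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def jaccard(a, b):
    """Exact set overlap, for comparison with the estimate."""
    return len(a & b) / len(a | b)
```

Only the signature (here one integer per hash function) needs to be stored per image, which is in the spirit of the small per-image storage the abstract describes. Using more hash functions reduces the variance of the estimate at the cost of a longer signature.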