Gang Yu, Junsong Yuan, and Zicheng Liu
28 November 2011
Many existing techniques in content-based video retrieval
treat a video sequence as a whole to match it against a query
video or to assign a text label. Such an approach has serious
limitations when applied to human action retrieval because
an action may occur only in a sub-region and last for a small
portion of the video length. In situations like this, we essentially
need to match the subvolumes of the video sequences
against the query video. A naive exhaustive search is impractical
due to the large number of possible subvolumes for each
video sequence. In this paper, we propose a novel framework
for action retrieval which performs pattern matching at the subvolume
level and is very efficient in handling a large corpus of
videos. We construct an unsupervised random forest to index
the video database, generate a score volume with Hough
voting, and then employ a max sub-path strategy to quickly
search for the temporal and spatial positions of matching actions
in all the video sequences in the database. We present action search
experiments on challenging datasets to validate the efficiency and
effectiveness of our system.
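To make the last stage of the pipeline more concrete, here is a minimal sketch of one common way to run a max-path search over a score volume with dynamic programming. It is an illustration under stated assumptions, not the implementation from the paper: the function name `max_sub_path`, the `radius` neighborhood, and the toy scores are all hypothetical, and the abstract does not specify how the score volume is laid out. The sketch assumes each cell `score[t][y][x]` holds a vote score (positive where the query matches), that a path may move at most `radius` cells between consecutive frames, and that a path may start and end at any frame.

```python
def max_sub_path(score, radius=1):
    """Find the highest-scoring spatio-temporal path in score[t][y][x].

    Hypothetical helper: returns (best_total, path), where path is a list
    of (t, y, x) cells visited in consecutive frames.
    """
    T, H, W = len(score), len(score[0]), len(score[0][0])
    best_total, best_end = float("-inf"), None
    prev = None   # accumulated path scores for the previous frame
    back = {}     # back-pointers: cell -> predecessor cell, or None
    for t in range(T):
        cur = [[0.0] * W for _ in range(H)]
        for y in range(H):
            for x in range(W):
                # Extend the best neighboring path only if it helps;
                # otherwise start a new path at this cell (Kadane-style).
                best_prev, best_cell = 0.0, None
                if prev is not None:
                    for v in range(max(0, y - radius), min(H, y + radius + 1)):
                        for u in range(max(0, x - radius), min(W, x + radius + 1)):
                            if prev[v][u] > best_prev:
                                best_prev, best_cell = prev[v][u], (t - 1, v, u)
                cur[y][x] = score[t][y][x] + best_prev
                back[(t, y, x)] = best_cell
                if cur[y][x] > best_total:
                    best_total, best_end = cur[y][x], (t, y, x)
        prev = cur
    # Walk the back-pointers from the best end cell to recover the path.
    path, node = [], best_end
    while node is not None:
        path.append(node)
        node = back[node]
    path.reverse()
    return best_total, path


# Toy score volume: 3 frames, 1 row, 3 columns (made-up values).
demo = [
    [[-1, 2, -1]],
    [[-1, -1, 3]],
    [[-4, -4, -4]],
]
total, path = max_sub_path(demo)
```

On the toy volume the search links the positive cell in frame 0 to the neighboring positive cell in frame 1 and stops before the negative third frame. The cost is linear in the number of cells (times the neighborhood size), which is the kind of saving over enumerating every subvolume that the abstract alludes to.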
In Multimedia (ACMMM)
Publisher ACM
© 2012 ACM. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the ACM.
| Type | Proceedings |