Gang Yu, Junsong Yuan, and Zicheng Liu
20 June 2011
Despite recent successes in searching for small objects in images, it remains a challenging problem to search for and locate actions in crowded videos because of (1) the large variation of human actions and (2) the high computational cost of searching the video space. To address these challenges, we propose a fast action search and localization method that supports relevance feedback from the user. By characterizing videos as spatio-temporal interest points and building a random forest to index and match these points, our query matching is robust and efficient. To enable efficient action localization, we propose a coarse-to-fine subvolume search scheme, which is several orders of magnitude faster than existing video branch-and-bound search. Challenging cross-dataset searches for several actions validate the effectiveness and efficiency of our method.
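The two stages the abstract describes can be illustrated with a minimal sketch: (1) matching the query's spatio-temporal interest-point descriptors against an indexed database so that each match casts a localization vote (the paper builds a random forest index for this; the sketch below substitutes a KD-tree from scikit-learn as a stand-in), and (2) a coarse-to-fine scan over candidate subvolumes scored by the votes they contain. All names, parameters, and the voting scheme here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.neighbors import KDTree  # stand-in for the paper's random-forest index

rng = np.random.default_rng(0)

# Database video: N spatio-temporal interest points, each with a (t, y, x)
# location in normalized coordinates and a D-dimensional descriptor.
N, D = 5000, 64
db_locs = rng.uniform(0.0, 1.0, size=(N, 3))
db_desc = rng.normal(size=(N, D))

index = KDTree(db_desc)  # built once, offline

def match_votes(query_desc, k=5):
    """Match each query descriptor to its k nearest database points; every
    match casts a localization vote at the matched point's (t, y, x)."""
    _, idx = index.query(query_desc, k=k)
    return db_locs[idx.reshape(-1)]

def score(lo, hi, votes):
    """Score an axis-aligned subvolume by the number of votes inside it."""
    return int(np.all((votes >= lo) & (votes < hi), axis=1).sum())

def coarse_to_fine_search(votes, coarse=4, refine=4):
    """Evaluate a coarse grid of subvolumes first, then refine the position
    of the best coarse cell only -- far cheaper than an exhaustive scan."""
    step = 1.0 / coarse
    cells = [(np.array(c) * step, np.array(c) * step + step)
             for c in np.ndindex(coarse, coarse, coarse)]
    lo, hi = max(cells, key=lambda cell: score(*cell, votes))
    # Fine pass: same-sized boxes shifted on a denser grid around the winner.
    fine_step = step / refine
    best, best_s = (lo, hi), score(lo, hi, votes)
    for c in np.ndindex(refine, refine, refine):
        flo = lo - step / 2 + np.array(c) * fine_step
        fhi = flo + step
        s = score(flo, fhi, votes)
        if s > best_s:
            best, best_s = (flo, fhi), s
    return best, best_s

# Query: descriptors extracted from the query action clip.
query_desc = rng.normal(size=(200, D))
votes = match_votes(query_desc)
(box_lo, box_hi), s = coarse_to_fine_search(votes)
print("detected subvolume:", box_lo, "->", box_hi, "votes:", s)
```

The design choice mirrors the abstract's argument: the index makes point matching sublinear in database size, and the coarse pass prunes most of the subvolume space before any fine evaluation, which is what makes the search fast relative to an exhaustive (or branch-and-bound) scan.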
Published in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
© 2012 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.