Junsong Yuan, Zicheng Liu, and Ying Wu
2011
Actions are spatiotemporal patterns. Similar to sliding-window object detection, action detection finds
recurrences of such spatiotemporal patterns through pattern matching, while coping with cluttered and dynamic backgrounds and
variations in how the action is performed. We address two critical issues in pattern matching-based action detection: 1) the intrapattern variations in
actions, and 2) the computational efficiency in performing action pattern search in cluttered scenes. First, we propose a discriminative
pattern matching criterion for action classification, called naive Bayes mutual information maximization (NBMIM). Each action is
characterized by a collection of spatiotemporal invariant features and we match it with an action class by measuring the mutual
information between them. Under this matching criterion, action detection amounts to localizing a subvolume in the volumetric video space
that has the maximum mutual information toward a specific action class. A novel spatiotemporal branch-and-bound (STBB) search
algorithm is designed to efficiently find the optimal solution. Our proposed action detection method does not rely on the results of
human detection, tracking, or background subtraction. It handles action variations such as differences in performing speed and style, as
well as scale changes. It is also insensitive to dynamic and cluttered backgrounds and even to partial occlusions. Cross-data
set experiments on action detection, using the KTH and CMU action data sets and a new MSR action data set, demonstrate the
effectiveness and efficiency of the proposed multiclass multiple-instance action detection method.
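
The subvolume search described in the abstract can be illustrated with a minimal sketch. It assumes, as a simplification not spelled out above, that every spatiotemporal location already carries a real-valued score toward one action class (positive for evidence in favor, negative against) and that a subvolume is scored by summing the values it contains. The function name `max_subvolume` and the dense `(T, H, W)` score array are hypothetical. This is an exhaustive baseline, not the STBB algorithm: it enumerates every spatial rectangle and runs Kadane's maximum-subarray scan along time, which is the kind of cost a branch-and-bound search is meant to avoid.

```python
import numpy as np

def max_subvolume(scores):
    """Exhaustive max-sum subvolume search over a (T, H, W) score array.

    Returns (best_sum, (t0, t1, y0, y1, x0, x1)) with inclusive bounds.
    Illustrative baseline only; not the paper's branch-and-bound search.
    """
    scores = np.asarray(scores, dtype=float)
    T, H, W = scores.shape

    # Per-frame 2D integral images, padded with a leading zero row and column,
    # so any rectangle sum in any frame costs four lookups.
    integral = np.zeros((T, H + 1, W + 1))
    integral[:, 1:, 1:] = scores.cumsum(axis=1).cumsum(axis=2)

    best_sum, best_box = -np.inf, None
    for y0 in range(H):
        for y1 in range(y0, H):
            for x0 in range(W):
                for x1 in range(x0, W):
                    # Score of this spatial rectangle in each frame.
                    frame = (integral[:, y1 + 1, x1 + 1] - integral[:, y0, x1 + 1]
                             - integral[:, y1 + 1, x0] + integral[:, y0, x0])
                    # Kadane's scan: best contiguous run of frames.
                    run, start = 0.0, 0
                    for t in range(T):
                        if run <= 0:
                            run, start = frame[t], t
                        else:
                            run += frame[t]
                        if run > best_sum:
                            best_sum = run
                            best_box = (start, t, y0, y1, x0, x1)
    return float(best_sum), best_box
```

On a toy volume, a block of positive scores embedded in a negative background is recovered exactly:

```python
demo = -np.ones((4, 5, 5))
demo[1:3, 2:4, 1:3] = 2.0
print(max_subvolume(demo))  # (16.0, (1, 2, 2, 3, 1, 2))
```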
In IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (TPAMI)
| Type | Article |