Oliver Williams, Andrew Blake, and Roberto Cipolla
This paper extends the use of statistical learning algorithms for object localization. It has been shown that object recognizers using kernel-SVMs can be elegantly adapted to localization by means of spatial perturbation of the SVM. While this SVM applies to each frame of a video independently of other frames, the benefits of temporal fusion of data are well-known. This is addressed here by using a fully probabilistic Relevance Vector Machine (RVM) to generate observations with Gaussian distributions that can be fused over time.
Rather than adapting a recognizer, we build a displacement expert which directly estimates displacement from the target region. An object detector is used in tandem, for object verification, providing the capability for automatic initialization and recovery. This approach is demonstrated in real-time tracking systems where the sparsity of the RVM means that only a fraction of CPU time is required to track at frame rate. An experimental evaluation compares this approach to the state of the art showing it to be a viable method for long-term region tracking.
In IEEE Transactions on Pattern Analysis and Machine Intelligence
Publisher IEEE Computer Society
Copyright © 2007 IEEE. Reprinted from IEEE Computer Society. This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to firstname.lastname@example.org. By choosing to view this document, you agree to all provisions of the copyright laws protecting it.
Oliver Williams. Bayesian Learning for Efficient Visual Inference, September 2005.