Nemanja Petrovic, Aleksandar Ivanovic, Nebojsa Jojic, Sumit Basu, and Thomas Huang
In this paper we present a generative model and learning procedure for unsupervised video clustering into scenes. The work addresses two important problems: realistic modeling of the sources of variability in the video, and fast, transformation-invariant frame clustering. We suggest a solution to the problem of computationally intensive learning in this model by combining recursive model estimation, fast inference, and on-line learning. Thus, we achieve real-time frame clustering performance. Novel aspects of this method include an algorithm for the clustering of Gaussian mixtures and the fast computation of the KL divergence between two mixtures of Gaussians. The efficiency and performance of the clustering and KL approximation methods are demonstrated. We also present a novel video browsing tool based on the visualization of the variables in the generative model.
Publisher: Institute of Electrical and Electronics Engineers, Inc.
© 2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.