Home Video Browsing and Consumption Through Exploration of a Learned Generative Model

Published by Institute of Electrical and Electronics Engineers, Inc.

Research on graphical models has found fertile ground in computer vision, partly due to this paradigm’s intuitive treatment of hidden causes of variability, which often makes model building, visualization and debugging easier and faster. The same advantage can be exploited in the next important step beyond data analysis: the development of user interfaces to visual media, and of media consumption tools, that empower the user by exposing the results of probabilistic inference in an intuitive way. Here, we provide an illustration of such a browsing/consumption tool for personal media, such as vacation videos. The summary of the video, extracted in real time by maximizing the likelihood under a simple transformation-invariant graphical model presented at the main CVPR 2006 conference (Petrovic et al.), contains for each frame its inferred class index and its alignment to the panoramic representation of the class. The user is presented with the set of learned panoramic clusters, and simply hovering the mouse over a scene element of interest automatically triggers playback of the relevant frames in the play window.
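
The hover-to-playback lookup described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each frame's alignment is a pure translation of its top-left corner within the class panorama and that all frames share one size. The names `FrameSummary` and `frames_under_cursor` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FrameSummary:
    """Per-frame summary: inferred class and alignment to that class's panorama.

    Assumption: the alignment is a translation (dx, dy) of the frame's
    top-left corner in panorama pixel coordinates.
    """
    frame_index: int
    class_index: int
    dx: int  # panorama column of the frame's left edge
    dy: int  # panorama row of the frame's top edge


def frames_under_cursor(
    summaries: List[FrameSummary],
    class_index: int,
    x: int,
    y: int,
    frame_w: int,
    frame_h: int,
) -> List[int]:
    """Return indices of frames in `class_index` whose footprint covers (x, y).

    (x, y) is the hover position in the panorama of the given class.
    Frames are returned in the order they appear in `summaries`.
    """
    hits = []
    for s in summaries:
        if s.class_index != class_index:
            continue
        inside_x = s.dx <= x < s.dx + frame_w
        inside_y = s.dy <= y < s.dy + frame_h
        if inside_x and inside_y:
            hits.append(s.frame_index)
    return hits


# Hypothetical summary for a four-frame video with 100x80 frames.
summaries = [
    FrameSummary(frame_index=0, class_index=0, dx=0, dy=0),
    FrameSummary(frame_index=1, class_index=0, dx=50, dy=0),
    FrameSummary(frame_index=2, class_index=1, dx=0, dy=0),
    FrameSummary(frame_index=3, class_index=0, dx=90, dy=10),
]

# Hovering at (60, 5) in the panorama of class 0 selects frames 0 and 1.
playlist = frames_under_cursor(summaries, 0, 60, 5, frame_w=100, frame_h=80)
```

A linear scan is enough for a sketch. A tool handling long videos would more plausibly precompute a spatial index per panorama so that each hover event is answered without visiting every frame.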