Clustering Videos by Location

Florian Schroff, C. Lawrence Zitnick, and Simon Baker


We propose an algorithm to cluster video shots by the location in which they were captured. Each shot is represented as a set of keyframes and each keyframe is represented by a histogram of textons. Clustering is performed using an energy-based formulation. We propose an energy function for the clusters that matches the expected distribution of viewpoints in any one location and use the chi-squared distance to measure the similarity of two shots. We also add a temporal prior to model the fact that temporally neighboring shots are more likely to have been captured in the same location. We test our algorithm on both home videos and professionally edited footage (sitcoms). Quantitative results are presented to justify each choice made in the design of our algorithm, as well as comparisons with k-means, connected components, and spectral clustering.
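As an illustration of the chi-squared comparison mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes each keyframe is a normalized texton histogram stored as a list of floats, and uses the common symmetric form of the chi-squared distance. The abstract does not say how keyframe distances are combined into a shot-level distance, so the `shot_distance` helper and its minimum-over-keyframe-pairs rule are hypothetical.

```python
from itertools import product


def chi_squared_distance(h1, h2):
    """Symmetric chi-squared distance between two equal-length histograms."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(h1, h2):
        s = a + b
        if s > 0:  # skip bins that are empty in both histograms
            total += (a - b) ** 2 / s
    return 0.5 * total


def shot_distance(shot_a, shot_b):
    """Hypothetical shot-level distance (an assumption, not from the paper):
    the smallest chi-squared distance over all pairs of keyframes."""
    return min(
        chi_squared_distance(ka, kb) for ka, kb in product(shot_a, shot_b)
    )


# Toy example: each shot is a list of 2-bin keyframe histograms.
shot_1 = [[1.0, 0.0], [0.5, 0.5]]
shot_2 = [[0.5, 0.5]]
print(shot_distance(shot_1, shot_2))
```

With normalized histograms this distance lies between 0 (identical) and 1 (no overlapping bins), which makes it convenient as a pairwise term in a clustering objective.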


Publication type: Inproceedings
Published in: Proceedings of the British Machine Vision Conference