Agate aims to automatically derive, from user-generated photos, a one-page visual summary of a location that captures its key sites of interest. In an interactive setting, a user can see “canonical views” of each site of interest and browse the photos that correspond to each canonical view. When textual descriptors of the photos are available, we augment the visual summaries with semantics obtained by analyzing the statistics of image tags.
Framework and Key Technology
Agate provides a visual summary of a location by learning its canonical views from the images of greatest interest shared by web users.
Its key technology is a non-parametric clustering algorithm, the MoM-DPM Sets model, which automatically determines the number of clusters, seamlessly combines heterogeneous features, and unifies clustering and ranking.
Figure 1 and Figure 2 illustrate the clustering process. A new image is matched against all available clusters, and is either assigned to its best-match cluster or used to start a new cluster, each with a certain probability.
Figure 1. If a match can be found, the new image is assigned to its best-match cluster.
Figure 2. If no match can be found, the new image forms a new cluster.
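The assignment step shown in the two figures can be sketched in a few lines of Python. This is only an illustration in the style of a Chinese restaurant process, not the MoM-DPM Sets model itself: the function name `assign`, the concentration parameter `alpha`, the Gaussian similarity with width `bandwidth`, and the use of plain numeric feature vectors are all assumptions made for the example.

```python
import math
import random


def assign(feature, clusters, alpha=1.0, bandwidth=1.0, rng=random):
    """Assign `feature` to an existing cluster or open a new one.

    `clusters` is a list of lists of feature vectors and is mutated in place.
    Returns the index of the cluster that received the feature.
    (Hypothetical helper; not taken from the Agate implementation.)
    """
    weights = []
    for members in clusters:
        # Match the new image against the cluster via its centroid.
        centroid = [sum(m[d] for m in members) / len(members)
                    for d in range(len(feature))]
        dist2 = sum((f - c) ** 2 for f, c in zip(feature, centroid))
        # Weight = cluster size times an assumed Gaussian similarity.
        weights.append(len(members) * math.exp(-dist2 / (2 * bandwidth ** 2)))

    # The last slot is the weight for starting a new cluster.
    weights.append(alpha)

    # Sample a slot with probability proportional to its weight.
    idx = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    if idx == len(clusters):
        clusters.append([feature])
    else:
        clusters[idx].append(feature)
    return idx


clusters = []
for photo in [(0.0, 0.0), (0.1, -0.1), (5.0, 5.0), (5.2, 4.9)]:
    assign(photo, clusters)
print(len(clusters), "clusters found")
```

Because the number of clusters grows only when the "new cluster" slot is sampled, it does not have to be fixed in advance; a larger `alpha` makes new clusters more likely.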
Knowledge Mined by Agate
Agate's output reveals interesting knowledge about a place. For example, it shows that San Francisco is famous both for its natural scenes (e.g. Golden Gate Bridge, Seal Rocks) and its historical sites (e.g. Alamo Square, Hunters Point, Cliff House).
In contrast, Monterey is famous for its natural scenes, while Santa Clara is famous for its city views and high-tech region and has fewer sites of interest.
Agate can also identify the best views of a place (the length of a triangle indicates the degree of interestingness). For example, the "sunset" view is the most famous one for "Half Moon Bay".
Yuheng Ren, Mo Yu, Xin-Jing Wang, Lei Zhang, Wei-Ying Ma. Diversifying Landmark Image Search Results by Learning Interested Views from Community Photos. 19th International World Wide Web Conference (WWW), Demo, 2010.