Arista is a data-driven image annotation system that annotates an image based on large-scale image search. It assumes that closely similar images share similar semantics. It aims to be a practical image annotation engine that can automatically annotate images of any popular concept. Started in 2006, Arista can now perform online tagging based on 2 billion web images by leveraging near-duplicate detection techniques.
Background, Motivation and Basic Idea
The key obstacle in computer vision research is the semantic gap between existing low-level visual features and high-level semantic concepts. Traditional image auto-annotation approaches attempt to map visual features directly to textual keywords. However, because these two types of features are heterogeneous, the intrinsic mapping function between them remains unknown.
Arista adopts a search-to-annotation strategy. Its basic idea is that closely similar images share similar semantics. Leveraging a large-scale, partly annotated image database, it annotates an image by first searching for a number of closely similar images within a content-based image retrieval framework, and then mining relevant terms and phrases from their surrounding texts. In this way, Arista avoids the semantic gap problem to a certain extent by measuring similarities in homogeneous feature spaces (i.e., image against image and text against text).
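The two steps above can be illustrated with a minimal Python sketch. It is not Arista's implementation: the three-dimensional feature vectors, the cosine similarity measure, the stop-word list, the similarity-weighted term counting, and the names `annotate` and `DATABASE` are all assumptions chosen to keep the example small. The real system searches billions of web images and relies on near-duplicate detection.

```python
import math
from collections import Counter

# Hypothetical stop-word list for this toy example.
STOP_WORDS = {"a", "the", "of", "in", "at", "on", "and"}

# Toy "partly annotated" database: (visual feature vector, surrounding text).
# Both the vectors and the texts are made up for illustration.
DATABASE = [
    ([0.90, 0.10, 0.00], "Eiffel tower at night"),
    ([0.80, 0.20, 0.10], "the Eiffel tower in Paris"),
    ([0.85, 0.15, 0.05], "Paris Eiffel tower photo"),
    ([0.00, 0.10, 0.90], "a cat on the sofa"),
    ([0.10, 0.00, 0.80], "sleeping cat"),
]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def annotate(query, database, k=3, num_tags=2):
    """Return up to num_tags terms mined from the k most similar images."""
    # Step 1 (image against image): retrieve the k visually closest images.
    scored = [(cosine(query, features), text) for features, text in database]
    neighbors = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]

    # Step 2 (text against text): score each term found in the neighbors'
    # surrounding texts, weighting it by the neighbor's visual similarity.
    scores = Counter()
    for similarity, text in neighbors:
        for term in dict.fromkeys(text.lower().split()):  # de-duplicate per text
            if term not in STOP_WORDS:
                scores[term] += similarity

    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:num_tags]]


if __name__ == "__main__":
    # An unlabeled query image whose features resemble the Eiffel tower images.
    print(annotate([0.88, 0.12, 0.02], DATABASE))
```

Note that the query's visual features are only ever compared with other visual features, and terms are only ever compared with other terms. No direct mapping from features to keywords is learned, which is the sense in which the strategy sidesteps the semantic gap.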