Xian-Sheng Hua, Linjun Yang, Jingdong Wang, Jing Wang, Ming Ye, Kuansan Wang, Yong Rui, and Jin Li
21 October 2013
The semantic gap between low-level visual features and high-level semantics has been investigated for decades but stillremains a big challenge in multimedia. When "search" became one of the most frequently used applications, "intent gap", the gap between query expressions and users' search intents, emerged. Researchers have been focusing on three approaches to bridge the semantic and intent gaps: 1) developing more representative features, 2) exploiting better learning approaches or statistical models to represent the semantics, and 3) collecting more training data with better quality. However, it remains a challenge to close the gaps. In this paper, we argue that the massive amount of click data from commercial search engines provides a data set that is unique in the bridging of the semantic and intent gap. Search engines generate millions of click data (a.k.a. image-query pairs), which provide almost "unlimited" yet strong connections between semantics and images, as well as connections between users' intents and queries. To study the intrinsic properties of click data and to investigate how to effectively leverage this huge amount of data to bridge semantic and intent gap is a promising direction to advance multimedia research. In the past, the primary obstacle is that there is no such dataset available to the public research community. This changes as Microsoft has released a new large-scale real-world image click data to public. This paper presents preliminary studies on the power of large-scale click data with a variety of experiments, such as building large-scale concept detectors, tag processing, search, definitive tag detection, intent analysis, etc., with the goal to inspire deeper researches based on this dataset.
|Published in||ACM Multimedia 2013|
|Publisher||ACM Conference on Multimedia|
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from firstname.lastname@example.org. MM’13, October 21–25, 2013, Barcelona, Spain. Copyright 2013 ACM 978-1-4503-2404-5/13/10 ...$15.00. http://dx.doi.org/10.1145/2502081.2502100 .