Automatic Image Dataset Construction from Click-through Logs Using Deep Neural Network

  • Yalong Bai
  • Kuiyuan Yang
  • Wei Yu
  • Chang Xu
  • Wei-Ying Ma
  • Tiejun Zhao

ACM Multimedia

Labeled image datasets are the backbone of high-level image understanding tasks with wide application scenarios, and they continuously drive and evaluate progress in feature design and supervised learning models. Recently, million-scale labeled image datasets have further contributed to the rebirth of deep convolutional neural networks, which bypass the manual design of handcrafted features. However, image dataset construction is still mainly manual and quite labor intensive: building a high-quality million-scale dataset often takes years of effort. In this paper, we propose a deep learning based method to construct large-scale image datasets automatically. Specifically, word representations and image representations are learned in a deep neural network from a large amount of click-through logs, and are then used to define word-word similarity and image-word similarity. These two similarities are used to automate the two labor-intensive steps in manual image dataset construction: query formation and noisy image removal. With a newly proposed cross convolutional filter regularizer, we can construct a million-scale image dataset in one week. Finally, two image datasets are constructed to verify the effectiveness of the method. In addition to scale, the automatically constructed dataset has accuracy, diversity, and cross-dataset generalization comparable to those of manually labeled image datasets.
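
The two similarity-driven steps named in the abstract, query formation and noisy image removal, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that words and images already live in a shared embedding space and that similarity is measured by cosine similarity; the abstract does not specify either. The function names, the toy vectors, the `top_k` value, and the threshold are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_queries(category, word_vecs, top_k=2):
    """Query formation: return the top_k words most similar to the category word."""
    target = word_vecs[category]
    scored = [
        (cosine(target, vec), word)
        for word, vec in word_vecs.items()
        if word != category
    ]
    scored.sort(reverse=True)  # highest word-word similarity first
    return [word for _, word in scored[:top_k]]


def filter_images(category, word_vecs, image_vecs, threshold=0.5):
    """Noisy image removal: keep images whose image-word similarity meets the threshold."""
    target = word_vecs[category]
    return [
        name
        for name, vec in image_vecs.items()
        if cosine(target, vec) >= threshold
    ]


# Toy embeddings, invented for illustration only.
word_vecs = {
    "dog": [1.0, 0.1],
    "puppy": [0.9, 0.2],
    "hound": [0.8, 0.3],
    "car": [0.0, 1.0],
}
image_vecs = {
    "img_dog.jpg": [0.95, 0.15],
    "img_car.jpg": [0.05, 0.9],
}

queries = expand_queries("dog", word_vecs, top_k=2)
kept = filter_images("dog", word_vecs, image_vecs, threshold=0.8)
print(queries)  # candidate search queries for the "dog" category
print(kept)     # images retained after similarity filtering
```

In this toy setup the category word "dog" is expanded to its nearest neighbors in the word space, and the image whose embedding points away from "dog" is discarded. In a real pipeline the embeddings would come from the network trained on click-through logs, and the threshold would need to be tuned.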