This dataset, named Microsoft Research Asia Multimedia (MSRA-MM), aims to encourage research in multimedia information retrieval and related areas. The images and videos in the dataset are collected from Internet search engines, so the performance of state-of-the-art industrial techniques can be evaluated on it. The dataset currently has two versions (1.0 and 2.0).
MSRA-MM 1.0 (Release Date: March 16 2009)
MSRA-MM 1.0 consists of two sub-datasets, an image dataset and a video dataset, collected from image and video search engines. For the image dataset, we selected 68 representative queries based on the query log of search engines and then collected about 1000 images for each query, giving 65443 images in all. For the video dataset, we selected 165 representative queries from the query log and collected 10277 videos accordingly. Features and annotations are provided. For more details, please read the technical report.
MSRA-MM 2.0 (Release Date: July 10 2009)
The image part contains about 1 million images from 1165 queries, and the video part contains 23 thousand videos. The associated web pages were also downloaded, and the surrounding text was extracted. For more details, please read the dataset description.
The ICDM Workshop on Internet Multimedia Mining 2009 will be held on December 6, 2009, in Miami, Florida, USA. For more details, please visit the workshop website.
Download the Dataset
- Meng Wang, Linjun Yang, and Xian-Sheng Hua, MSRA-MM: Bridging Research and Industrial Societies for Multimedia Information Retrieval, no. MSR-TR-2009-30, 16 March 2009.