Product Image Categorization Data Set (PI 100)
Product Image Categorization
Image categorization usually refers to "the labeling of images into one of a number of predefined categories" and has been an important research topic in past decades. Different taxonomy definitions lead to different problems with very different solutions. For example, a few previous approaches categorize images by their generation method, such as photographs versus graphics. Other work defines image categories as a hierarchy of high-level concepts, such as indoor/outdoor or cityscape/landscape.
In this data set, we focus on a particular type of taxonomy: product category. In other words, an image is categorized by the product type of its main object, such as camera or guitar. We realize that in general, categorizing objects is a very difficult computer vision task. In order to build a practical system, two requirements are introduced for our data set:
- Main object dominance
Each image in our data set shows a single object or one dominant object. We aim neither to detect objects against complex backgrounds nor to categorize images by high-level concepts, which are usually expressed as multi-object scenes.
- Appearance stableness
We deal with images whose objects have relatively stable forms and appearances. For example, the product category "keyboard" is within our scope, while "beef" is not, because the latter takes too many different forms and appearances.
These two requirements, though strict in some sense, match many real-life applications, since most product images on the Web satisfy them. A potential application of our data set is collecting data for Web product search services, such as Froogle (http://froogle.google.com). Another application is mobile search. For example, a user can take a photo of a product of interest and search the Web for similar photos to get more information.
Data Set - PI 100
One million candidate images in 400 categories were collected from the MSN Shopping web site (http://shopping.msn.com/). Two graduate students were invited to select 100 categories, with 120 images in each category, from these candidates. When selecting the database images, we asked them to follow the two requirements and remove all duplicate images.
We randomly split the 120 images of each category into two sets: 100 as the database set, and 20 as the query set. The following figure shows one sample image from each category in our database.
- Data Set: 10K product images in 100 categories (resolution 100x100) [download (zip, 26.3M), part 1, part 2]
- Query Set: 2K query images in 100 categories (resolution 100x100) [download (zip, 4.3M)]
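The random per-category split described above (120 images into 100 database and 20 query images) can be illustrated with a minimal Python sketch. This is not the script used to build PI 100. The function names, the seed handling, and the placeholder file names are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple


def split_category(images: List[str], n_query: int = 20,
                   seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly split one category's images into (database, query) lists."""
    if len(images) <= n_query:
        raise ValueError("need more images than the query set size")
    rng = random.Random(seed)   # local generator, so the split is reproducible
    shuffled = list(images)     # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled[n_query:], shuffled[:n_query]


def split_dataset(by_category: Dict[str, List[str]], n_query: int = 20,
                  seed: int = 0):
    """Apply the per-category split to every category."""
    database: Dict[str, List[str]] = {}
    query: Dict[str, List[str]] = {}
    for index, (name, images) in enumerate(sorted(by_category.items())):
        # A different seed per category avoids reusing the same permutation.
        database[name], query[name] = split_category(images, n_query, seed + index)
    return database, query


# Hypothetical input: 100 categories with 120 placeholder file names each.
example = {
    f"category_{c:03d}": [f"category_{c:03d}/img_{i:03d}.jpg" for i in range(120)]
    for c in range(100)
}
database, query = split_dataset(example)
```

With 100 categories of 120 images each, this yields 10,000 database images and 2,000 query images, matching the set sizes listed above. Splitting within each category, rather than over the pooled images, keeps every category equally represented in both sets.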
Please cite the following paper when using this dataset:
Xing Xie, Lie Lu, Menglei Jia, Hua Li, Frank Seide, Wei-Ying Ma, Mobile Search with Multimodal Queries, Proceedings of the IEEE, Vol. 96, No. 4, Apr. 2008.