MSRA-CFW: Data Set of Celebrity Faces on the Web

MSRA-CFW is a data set of celebrity face images collected from the web. Starting from any face image, we obtain its near-duplicate images and the text surrounding them. We then detect the dominant person names by matching that text against a large list of celebrity names drawn from public websites such as Wikipedia. A classifier is applied to further identify the celebrities appearing in the web images. The final data set contains 202,792 faces of 1,583 people.
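The name-matching step described above can be illustrated with a small sketch. This is not the authors' implementation: the function name `dominant_name`, the `min_share` threshold, and the simple whole-phrase matching are all assumptions made for illustration. It counts how many of the surrounding texts of a near-duplicate group mention each name from a celebrity list and keeps the most frequent name only if it appears in a large enough share of the texts.

```python
import re
from collections import Counter


def dominant_name(texts, celebrity_names, min_share=0.5):
    """Return the celebrity name mentioned in the largest share of texts.

    Hypothetical helper: `min_share` is an assumed threshold, not a value
    taken from the data set description. Returns None when no name is
    mentioned often enough.
    """
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for name in celebrity_names:
            # Case-insensitive whole-phrase match; each name is counted
            # at most once per surrounding text.
            pattern = r"\b" + re.escape(name.lower()) + r"\b"
            if re.search(pattern, lowered):
                counts[name] += 1
    if not counts:
        return None
    name, hits = counts.most_common(1)[0]
    return name if hits / len(texts) >= min_share else None


# Surrounding texts gathered from the near-duplicates of one face image
# (made-up examples).
texts = [
    "Tom Hanks at the premiere",
    "Photo: tom hanks and Rita Wilson",
    "Tom Hanks wins award",
    "red carpet gallery",
]
names = ["Tom Hanks", "Rita Wilson"]
print(dominant_name(texts, names))
```

In this toy input, three of the four texts mention the same name, so that name is returned. A real pipeline would still need the classifier step mentioned above, since the dominant name in the text does not guarantee that the person appears in the image.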

Lei Zhang (张磊)

Senior Applied Researcher

Multimedia Search Team
Search Technology Center Asia

No. 5 Danling Street, Haidian District
Beijing 100080, P.R.China
Email: leizhang AT