Media Computing
Rapid advancement of the Internet and digital storage technologies has
caused an explosion of multimedia data. The Media Computing (MC) group
at MS Research China is working on next-generation multimedia processing
and system technologies to enable users to access the information they
desire in whatever format they prefer, via any information appliance, at
any place in the world and at any time.
We focus our research on pattern recognition, media content analysis and
summarization, transformation of unstructured visual data into
structured and easily accessible information, and multimedia search and
retrieval.
Primary Contact: Hong-Jiang Zhang
Affiliate Members
Steven
Tie-Yan
Tim
Zhengyou
Hong-Jiang
Zou, Xinli
Pattern Recognition and Machine Learning: This project is aimed at understanding fundamental problems
in media computing, and developing new techniques and algorithms for analysis and classification of real world
image, video and audio data. The basic issues are: (1) understanding intrinsically low-dimensional structures
or sub-manifolds of patterns of interest embedded in high dimensional data, and (2) discriminating between
different patterns. The topics include example-based learning, linear and nonlinear subspace analysis,
statistical and neural network methods for modeling and classification.
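As a small illustration of linear subspace analysis, the sketch below estimates the direction of largest variance in a data set (the first principal component) by running power iteration on the sample covariance matrix. This is a generic textbook technique, not necessarily the method used in the project; the function name `top_principal_direction` and the sample points are assumptions made for the example.

```python
import math


def top_principal_direction(points, iterations=200):
    """Return a unit vector along the direction of largest variance.

    Hypothetical helper for illustration: power iteration on the sample
    covariance matrix of `points` (a list of equal-length float lists).
    """
    n = len(points)
    dim = len(points[0])
    # Center the data on its mean.
    mean = [sum(p[i] for p in points) / n for i in range(dim)]
    centered = [[p[i] - mean[i] for i in range(dim)] for p in points]
    # Sample covariance matrix (dim x dim).
    cov = [[sum(row[i] * row[j] for row in centered) / (n - 1)
            for j in range(dim)] for i in range(dim)]
    # Power iteration from a fixed start vector; this assumes the start
    # vector is not orthogonal to the dominant eigenvector.
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all points identical: no variance to explain
        v = [x / norm for x in w]
    return v


# Points that lie on the line y = 2x, embedded in a 2-D space.
direction = top_principal_direction([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
```

Projecting each centered point onto `direction` would give a one-dimensional representation of this data; nonlinear sub-manifolds need other techniques.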
Audio Content Analysis: This project aims to develop technologies and algorithms for segmentation,
classification and retrieval of audio data. An audio clip is segmented and classified into semantic
classes of sound, such as speech, music, background sound and silence. Based on these technologies, we can
find sounds in a database that are similar in content to a given audio clip. Applications include
music/song retrieval by humming, and speaker segmentation and identification.
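To make the idea of segmenting a clip into classes concrete, here is a minimal sketch that separates only silence from non-silence using short-time energy. It is a deliberately simplified stand-in: separating speech, music and background sound would need richer features and a trained classifier, and the frame size, threshold and function names are assumptions chosen for the example.

```python
def frame_energies(samples, frame_size):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(sum(s * s for s in frame) / frame_size)
    return energies


def segment_by_energy(samples, frame_size=4, threshold=0.01):
    """Group consecutive frames into (label, first_frame, last_frame) runs.

    The label is 'sound' when the frame energy exceeds `threshold`
    (an illustrative value), otherwise 'silence'.
    """
    segments = []
    for index, energy in enumerate(frame_energies(samples, frame_size)):
        label = "sound" if energy > threshold else "silence"
        if segments and segments[-1][0] == label:
            # Extend the current run to include this frame.
            segments[-1] = (label, segments[-1][1], index)
        else:
            segments.append((label, index, index))
    return segments


# Synthetic clip: two silent frames, two loud frames, one silent frame.
clip = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 4
segments = segment_by_energy(clip)
```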
Digital Album: The goal is to develop technologies for efficient management of personal photos.
Users will be able to automatically annotate their photos and search them by names, places, date and
time, events, and examples.
Image Retrieval: The goal is to develop the next generation of image search and retrieval technologies,
so that users can efficiently and effectively find the content they want among the vast amount of information
on the Internet. Current research topics include image feature representation, visual concept
learning, relevance feedback, automatic annotation, user log mining, and web image indexing.
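Relevance feedback can be illustrated with the classic Rocchio update, which moves a query feature vector toward results the user marked relevant and away from those marked non-relevant. The text does not say which formulation the group uses, so this is only one well-known option; the function name, the weights and the toy feature vectors are assumptions.

```python
def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return alpha*query + beta*mean(relevant) - gamma*mean(non_relevant).

    All vectors are equal-length lists of floats; the default weights are
    illustrative values, not tuned settings.
    """
    dim = len(query)

    def centroid(vectors):
        # An empty feedback set contributes nothing to the update.
        if not vectors:
            return [0.0] * dim
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

    rel = centroid(relevant)
    non = centroid(non_relevant)
    return [alpha * query[i] + beta * rel[i] - gamma * non[i] for i in range(dim)]


# Toy 2-D image features: the user liked two results and rejected one.
new_query = rocchio_update(
    [1.0, 0.0],
    relevant=[[0.0, 1.0], [0.0, 3.0]],
    non_relevant=[[2.0, 0.0]],
    alpha=1.0, beta=0.5, gamma=0.25,
)
```

The updated vector would then be used to re-rank the image database in the next round of retrieval.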
Video Content Analysis, Representation and Access: The goal is to develop advanced digital video
technologies that help users manage, search, and enjoy videos. Content analysis leads to a structural
content-based representation for effective indexing, random access, and content-based classification and retrieval.
The key technologies include motion segmentation, shot boundary detection, key-frame extraction, event detection,
scene grouping and anchor person detection. By integrating these technologies with other information contained in
video, such as audio, speech, and text, we aim to provide a systematic solution for digital video management
and service.
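Of the technologies listed, shot boundary detection is the easiest to sketch. A common baseline compares gray-level histograms of consecutive frames and declares a cut when they differ sharply. The code below assumes frames are flat lists of 8-bit gray values; the bin count, threshold and function names are illustrative choices, not the project's implementation.

```python
def gray_histogram(frame, bins=4):
    """Normalized histogram of 8-bit gray levels in a flat list of pixels."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [c / total for c in counts]


def shot_boundaries(frames, bins=4, threshold=0.5):
    """Return indices i where a cut is declared between frame i-1 and frame i."""
    if not frames:
        return []
    boundaries = []
    previous = gray_histogram(frames[0], bins)
    for index in range(1, len(frames)):
        current = gray_histogram(frames[index], bins)
        # L1 distance between normalized histograms lies in [0, 2].
        distance = sum(abs(a - b) for a, b in zip(previous, current))
        if distance > threshold:
            boundaries.append(index)
        previous = current
    return boundaries


# Toy sequence: two dark frames, two bright frames, then a dark frame.
dark = [10, 20, 30, 40]
bright = [200, 210, 220, 250]
cuts = shot_boundaries([dark, dark, bright, bright, dark])
```

A fixed threshold like this handles only abrupt cuts; gradual transitions such as fades and dissolves generally require comparing frames over a longer window.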
Face Detection, Tracking and Recognition: This project aims to develop techniques and algorithms
for automated face recognition. The research topics are fast and reliable face detection, tracking, alignment
and recognition under varying viewpoints and illumination conditions. Applications include digital albums and
image and video indexing and retrieval.