Statistical Media Processing
Statistical Media Processing (SMP) is a research project within the Knowledge Tools Group. Work in this project develops both
technology and applications that lie at the intersection of media and statistics.
Here, media is
defined as the information that people use for communication, collaboration, entertainment,
and memory archiving. Such media can include audio, still images, and video. In addition,
metadata that refers to such information, such as descriptive metadata
and user models, also counts as media. Statistics is defined
as the automatic construction of intelligent systems by the examination of data. Statistics
is a superset of machine learning. Statistical media processing includes media identification,
classification, clustering, enhancement, recommendation, organization, and search.
The intersection between media and statistics is a very
fruitful area of research, because each half stimulates new ideas in the other. By
creating new machine learning and statistical algorithms that are appropriate for
media processing, and by training these algorithms with real media gathered in real
situations, we hope to advance the state of the art in media processing and give Microsoft
software the best media processing features. Conversely, applying machine learning to media
will push it in new directions, away from the classical classification
and regression problems. These new directions include representations for media
that are appropriate for machine learning, new algorithms to handle the small amounts
of data typically available for user modeling, and new algorithms for modeling and
enhancing media.
- Audio Fingerprinting --- A system that automatically identifies a clip in an audio
stream, even if the stream is distorted or noisy. The system includes technology for
automatically extracting noise-robust features from signals, and a fast database lookup
algorithm.
- AutoDJ --- A system for automatically generating music playlists, given one or more seed songs
selected by a user. The system uses a machine learning algorithm that learns from
previous experience.
- Statistical Acoustic Signal Processing --- Methods for enhancing audio capture on the PC,
including echo cancellation, denoising, and dereverberation. These enhancements are
based on adaptive filters and advanced statistical methods.
- AutoAlbum & PhotoTOC --- An interface that allows users to easily browse their digital
photographs. The interface uses clustering and probabilistic methods to automatically
create a table of contents for a set of images.
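The acoustic signal processing item above mentions echo cancellation based on adaptive filters. As a rough illustration only, the sketch below runs normalized LMS (NLMS), a textbook adaptive filter, against a synthetic echo path. It is not the project's actual algorithm: the function name `nlms_echo_cancel`, the tap count, the step size, and the three-tap `true_path` are all assumptions made for this example.

```python
import random


def nlms_echo_cancel(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Return (errors, weights) for a normalized LMS echo canceller.

    far_end: samples sent to the loudspeaker (the reference signal).
    mic:     samples captured by the microphone (here, echo only).
    errors:  mic minus the estimated echo, i.e. the cleaned-up signal.
    """
    weights = [0.0] * num_taps   # adaptive estimate of the echo path
    history = [0.0] * num_taps   # recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate
        # Normalizing by the input energy keeps the update stable
        # regardless of the loudspeaker signal level.
        energy = sum(h * h for h in history) + eps
        weights = [w + step * e * h / energy for w, h in zip(weights, history)]
        errors.append(e)
    return errors, weights


# Synthetic test: a hypothetical three-tap echo path and a white-noise far-end signal.
random.seed(0)
true_path = [0.6, -0.3, 0.1]
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [
    sum(g * far_end[n - k] for k, g in enumerate(true_path) if n - k >= 0)
    for n in range(len(far_end))
]

errors, weights = nlms_echo_cancel(far_end, mic)
residual_power = sum(e * e for e in errors[-100:]) / 100
print("residual power over the last 100 samples:", residual_power)
print("estimated echo path:", [round(w, 3) for w in weights[:3]])
```

In this noise-free setup the filter weights approach `true_path` and the residual echo shrinks toward zero. In real capture the microphone also picks up near-end speech and noise, so adaptation is typically slowed or paused when both ends talk at once, which is the double-talk detection problem that several of the publications listed on this page address.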
Publications on Audio and speech processing:
-
Normalized Double-Talk Detection Based on Microphone and AEC Error Cross-Correlation by M.A. Iqbal, J.W. Stokes, S.L. Grant, Proc. IEEE Int'l Conf. on Multimedia and Expo, (2007).
-
Double-talk Detection using Real-time Recurrent Learning by M.A.
Iqbal, J.W. Stokes, J.C. Platt, A.C. Surendran, S.L. Grant, Int'l Workshop on
Acoustic Echo and Noise Control, (2006).
-
Speaker Identification using a Microphone Array and a Joint HMM with Speech
Spectrum and Angle of Arrival by J.W. Stokes, J.C. Platt, S. Basu,
Proc. ICASSP, Vol. 3, pp. 736-739, (2006).
-
Robust RLS with Round Robin Regularization including Application to Stereo
Acoustic Echo Cancellation by J.W. Stokes, J.C. Platt, Proc. ICME,
(2006).
-
Acoustic Echo Cancellation for High Noise Environments by A.S. Chhetri, J.W. Stokes, Proc. ICME, (2006).
-
Acoustic Echo Cancellation in a Channel with Rapidly Varying Gain
by S. Basu, Proc. ICME, (2006).
-
Hidden Conditional Random Fields for Phone Classification by A. Gunawardana, M. Mahajan, A. Acero, J.C. Platt, Proc. Interspeech, (2005).
- Regression-based Residual Acoustic Echo Suppression
by A. Chhetri, A.C. Surendran, J.W. Stokes, J.C. Platt, International Workshop on Acoustic Echo and Noise Control, (2005).
-
The Audio Epitome: A New Representation for Modeling and Classifying
Auditory Phenomena by A. Kapoor, S. Basu, Proc. ICASSP, Vol. 5, pp.
189-192, (2004).
- Convolutional Networks for Speech Detection
by S. Sukittanon, A.C. Surendran, J.C. Platt, and C.J.C. Burges, ICSLP, (2004).
- Logistic Discriminative Speech Detectors using Posterior SNRs
by A.C. Surendran, S. Sukittanon, and J.C. Platt, ICASSP, (2004).
-
Acoustic Echo Cancellation with Arbitrary Playback Sampling Rate
by J.W. Stokes, H.S. Malvar, Proc. ICASSP, Vol. 4, pp. 153-156,
(2004).
Publications on Music analysis and identification:
-
ARGOS: Automatically Extracting Repeating Objects from Multimedia
Streams by C. Herley, IEEE Trans. on Multimedia, Vol. 8, No. 1, pp.
115-129, (2006).
-
Using Audio Fingerprinting for Duplicate Detection and Thumbnail Generation
by C.J.C. Burges, D. Plastina, J.C. Platt, E. Renshaw, and H.S. Malvar,
Proc. ICASSP, Vol. 3, pp. 9-12, (2005).
-
Accurate Repeat Finding and Object Skipping using Fingerprints
by C. Herley, Proc. ACM Multimedia, pp. 656-665, (2006).
-
Redundant Bit Vectors for Quickly Searching High-Dimensional Regions
by J. Goldstein, J.C. Platt, C.J.C. Burges, Proc. Sheffield Machine Learning Workshop, Springer Lecture Notes in Computer Science 3635, (2005).
- Extracting Repeats from Media Streams by C. Herley, Proc.
ICASSP, Vol. 5, pp. 913-916, (2004).
-
Distortion Discriminant Analysis for Audio Fingerprinting
by C.J.C. Burges, J.C. Platt, S. Jana, IEEE Trans. on Speech and Audio Processing, Vol. 11, No. 3,
pp. 165-174, (2003).
Publications on Music synthesis and recommendation:
-
MySong: Automatic Accompaniment Generation for Vocal Melodies by
I. Simon, D. Morris, S. Basu, Proc. CHI, (2008).
-
Audio Analogies: Creating New Music from an Existing Performance by
Concatenative Synthesis by I. Simon, S. Basu, D. Salesin, and M.
Agrawala, Proc. Int'l Conf. on Computer Music, (2005).
-
Inferring Similarity between Music Objects with Application to
Playlist Generation by R. Ragno, C.J.C. Burges, C. Herley, Proc. ACM
Int'l Workshop on Multimedia Information Retrieval, pp. 73-80, (2005).
-
Fast Embedding of Sparse Music Similarity Graphs by J.C. Platt, NIPS 16, pp. 571-578, (2004).
-
Mixing with Mozart by S. Basu, Proc. Int'l Conf. on Computer
Music, (2004).
-
Learning a Gaussian Process Prior for Automatically Generating Music Playlists
by J.C. Platt, C.J.C. Burges, S. Swenson, C. Weare, A. Zheng, NIPS 14, pp. 1425-1432, (2002).
Publications on Image processing and display:
-
Home Video Browsing and Consumption through Exploration of a Learned
Generative Model by N. Jojic, S. Basu, and N. Petrovic, Proc. CVPR,
(2006).
-
Recursive Estimation of Generative Models of Video by N. Petrovic, A. Ivanovic, N. Jojic, S. Basu, T. Huang, Proc. CVPR, (2006).
-
Multiple Instance Boosting for Object Detection by P. Viola, J.C. Platt,
C. Zhang, NIPS, Vol. 18, pp. 1417-1426, (2006).
-
Occlusion Removal from Minimum Number of Images by C. Herley,
Proc. ICIP, Vol. 2, pp. 1046-1049, (2005).
-
Learning Spatially-Variable Filters for Super-Resolution of Text
by A. Corduneanu, J.C. Platt, Proc. ICIP, (2005).
-
Efficient Inscribing of Noisy Rectangular Objects in Scanned Images
by C. Herley, Proc. ICIP, Vol. 4, pp. 2399-2402, (2004).
-
PhotoTOC: Automatic Clustering for Browsing Personal Photographs
by J.C. Platt, M. Czerwinski, B. Field, Fourth IEEE Pacific Rim Conference on Multimedia, (2003).
- Recursive Method to Extract Rectangular Objects from Scans by C.
Herley, Proc. ICIP, Vol. 3, pp. 989-992, (2003).
- Document Capture Using a Digital Camera
by C. Herley, Proc. International Conference on Image Processing, (2001).
-
AutoAlbum: Clustering Digital Photographs Using Probabilistic Model Merging
by J.C. Platt, Proc. IEEE Workshop on Content-Based Access of Image and Video Libraries 2000,
pp. 96-100, (2000).
- Optimal Filtering for Patterned Displays by J.C. Platt,
IEEE Signal Processing Letters, Vol. 7, No. 7, pp. 179-181, (2000).
- Displaced Filtering for Patterned Displays by C. Betrisey,
J.F. Blinn, B. Dresevic, B. Hill, G. Hitchcock, B. Keely, D.P. Mitchell, J.C. Platt, T. Whitted, Proc. Society for Information Display Symposium, pp. 296-299, (2000).