Chris J.C. Burges

PRINCIPAL RESEARCHER

The TMSN Group at Microsoft Research, Redmond

I'm a Principal Researcher in, and manager of, the Text Mining, Search and Navigation (TMSN) group at Microsoft Research.  Our work covers a broad spectrum of research activities, from fundamental, theoretical machine learning, to user studies for evaluating new ideas and for comparative assessments, to computing and applying string similarity metrics using log data, to applying high-level semantic information extracted from online sources such as Wikipedia, to building platforms for experimenting with new algorithms and new user interfaces, to modeling user behaviour to better target ads.  Our ranking and classification technologies are used throughout Microsoft's Live Search, online Ads (AdCenter), and other online services: for example, our ranking technology is used for core Web Search, for Web Search verticals like image and commerce search, and for new projects such as improving Ads relevance.  We are interested in working with structured data, often represented as graphs: for example, AdCenter uses our user-modeling work to better target online ads.  We also help with systems engineering and data integrity.  Our work on semantic information extraction has a multitude of uses - for example, to automatically segment video transcripts for MSNBC.  I invite you to browse the team's web pages to find out more!

 

My Own Research

I'm interested in machine learning, optimization methods, information retrieval, and algorithms for Web applications.  I'm particularly interested in finding principled methods to solve Web-sized problems, and in leveraging Web-sized problems to discover new principled methods.  Here are a few things I've been working on recently - for more information please visit my publications page.

 

Ranking for Information Retrieval 

Call for Papers for the NIPS 2009 Advances in Ranking Workshop!

RankNet is a simple and effective way to learn how to rank using pair-based neural network training.  RankNet models the probability of one item being ranked higher than another, and so is well suited to optimizing the area under the ROC curve, which measures pairwise errors.  LambdaRank builds on RankNet in that it is aimed at directly optimizing the kinds of cost functions that are of interest for information retrieval.  This is a particularly challenging learning task since such measures, viewed as functions of the model parameters, are either flat or discontinuous everywhere.  We recently showed that LambdaRank in fact provides a simple method for directly optimizing IR measures.  We have also explored approaching ranking as classification, and then combined the ideas in LambdaRank with MART, a boosted trees algorithm, to construct LambdaMART, which gives improved flexibility and accuracy.
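
To make the pairwise idea concrete, here is a minimal sketch, assuming the usual logistic form for the probability that one item outranks another and a cross-entropy cost on that probability.  The function names, the scale parameter sigma, and the toy scores are illustrative assumptions rather than code from our systems; the last function counts correctly ordered (relevant, non-relevant) pairs, which is the pairwise quantity that the area under the ROC curve measures.

```python
import math


def pair_probability(s_i, s_j, sigma=1.0):
    """Modeled probability that item i should rank above item j.

    s_i and s_j are the model's scores for the two items; sigma is an
    assumed scale parameter for the logistic function.
    """
    return 1.0 / (1.0 + math.exp(-sigma * (s_i - s_j)))


def pair_cost(s_i, s_j, target, sigma=1.0):
    """Cross-entropy between the target and modeled pair probabilities.

    target is the desired probability that i ranks above j
    (1.0, 0.5 or 0.0 in the simplest labeling).
    """
    diff = sigma * (s_i - s_j)
    # log(1 + exp(-diff)), computed in a numerically stable way
    softplus = max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return (1.0 - target) * diff + softplus


def pairwise_auc(scores, labels):
    """Fraction of (relevant, non-relevant) pairs ordered correctly.

    labels are 1 for relevant items and 0 for non-relevant ones;
    ties in score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return correct / (len(pos) * len(neg))
```

In this sketch the cost falls as the score of the preferred item rises relative to the other, so minimizing it over many pairs pushes the model toward fewer pairwise errors.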

 

Review Articles and Talks

Although we usually don't get to teach classes directly at MSR, there are still many opportunities to teach (and even more to learn).  In TMSN, we have a Learning Theory Book Club, a Machine Learning Reading Group, and bi-weekly Tea Times.

Here's a draft of a tutorial review article on Dimension Reduction.  It covers many well-known, and some less well-known, methods for dimension reduction for which the inferred variables are continuous.  Here's a lecture I gave recently at the University of Washington - part 1 of 2 - on the mathematical foundations for machine learning.  It's an updated version of lectures I gave at the machine learning summer school at the Max Planck Institute in Tuebingen in 2003; here's a condensed version of those lectures.

 

Audio Fingerprinting

You have an incoming stream of audio and you'd like to know what's playing. Our RARE (Robust Audio Recognition Engine) system can identify any one of about a quarter million songs in real time using about 10% CPU on an 833 MHz PC. On 36 hours of noisy test audio, it achieves 0.2% false positives at a 4 × 10^-6 false negative rate. Confirmation fingerprints can be used to significantly further improve these error rates, with almost no extra CPU cost. Our work is currently used in Windows Media Player and in the Zune media player.  Audio fingerprinting has lots of applications: for example, to automatically construct audio thumbnails, and to automatically find duplicate audio clips on your PC. Our main innovations are a method to train for robustness to distortions, and a lookup method that is over an order of magnitude faster than competing methods for this problem.  See here for details. Joint work with J. Platt, J. Goldstein, E. Renshaw, C. Herley.

 

Personal Interests

The hiking in western Washington is great! Here are some views from the Central Cascades: Rachel Lake, Mailbox Peak, Rampart Lakes and Rainbow Lake.