Machine Learning in IR: Recent Successes and New Opportunities

This tutorial was presented at the 26th International Conference on Machine Learning (ICML-2009) on June 14, 2009.


Paul Bennett
Misha Bilenko
Kevyn Collins-Thompson

TUTORIAL SLIDES are available here.

The tutorial focuses on the interplay between information retrieval (IR) and machine learning. This intersection of research areas has seen tremendous growth and progress in recent years, much of it fueled by incorporating machine learning techniques into the core of information retrieval technologies, including Web search engines, e-mail and news filtering systems, music and movie recommendations, online advertising systems, and many others. As the complexity, scale, and user expectations of retrieval technology increase, it becomes increasingly important for each field to keep pace with and inform the other.

With that goal in mind, this tutorial covers:

  • The nature of the challenging learning problems faced at many levels by search technology systems today
  • Successful applications of machine learning methods to make progress in key IR tasks
  • Opportunities for joint future progress and emerging research problems that will benefit both machine learning and information retrieval


Tutorial Outline

1. IR and learning-related issues

    • The basic IR paradigm
    • Richness of ML tasks in IR
    • Salient properties of IR tasks
        • Dealing with uncertainty in many problem aspects
        • Evaluation measures
        • Dealing with scale
        • Adversarial challenges & temporal issues (freshness, drift)

2. IR tasks meet learning methods (a.k.a. recent successes)

    • Modeling information needs
    • Learning from user behavior
    • Learning to rank in IR

3. New opportunities

    • Learning complex structured outputs: diversity, novelty, and redundancy
    • Risk-reward tradeoffs for retrieval algorithms: exploring optimization frameworks, constraints and objectives
    • Computational advertising: from multiple objectives to multi-armed bandits

4. Summary

    • Pointers to data resources
    • Bibliography