LETOR: Learning to Rank for Information Retrieval

Overview

This website is designed to facilitate research in LEarning TO Rank (LETOR). It hosts a wide range of information about learning to rank, including benchmark datasets, public baselines, published papers, research communities, and past and upcoming events.

Microsoft Learning to Rank Datasets have been released.

  • Two large datasets were released, one with more than 30,000 queries and the other with 10,000 queries.
  • 136 features have been extracted for each query-url pair (a parsing sketch for the feature format follows this list).
  • The datasets can be downloaded from the Microsoft Research website.
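
For illustration, here is a minimal Python sketch of how such query-url feature lines are typically read, assuming the SVMlight-style format "<label> qid:<id> 1:<v1> 2:<v2> ... [# optional comment]" used by the LETOR/MSLR releases; the sample line and feature values below are purely illustrative.

    def parse_line(line):
        """Return (relevance_label, query_id, feature_vector) for one dataset line."""
        line = line.split("#", 1)[0].strip()   # drop the optional trailing comment
        tokens = line.split()
        label = int(tokens[0])                 # graded relevance judgment
        qid = tokens[1].split(":", 1)[1]       # "qid:10" -> "10"
        features = {}
        for tok in tokens[2:]:                 # "7:0.392" -> {7: 0.392}
            idx, val = tok.split(":", 1)
            features[int(idx)] = float(val)
        return label, qid, features

    if __name__ == "__main__":
        sample = "2 qid:10 1:0.03 2:0.00 3:1.00 136:0.25"
        print(parse_line(sample))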

A learning to rank challenge organized by Yahoo! Labs was launched.

  • Two large datasets with tens of thousands of queries were released.
  • The challenge consisted of two tracks: a standard learning to rank track as well as a transfer learning one.
  • It was open to all research groups in academia and industry.
  • Details can be found at the challenge website.

LETOR 4.0 was released in July 2009.

  • Two large-scale query sets with thousands of queries were used;
  • Datasets for four kinds of ranking settings were provided: supervised ranking, semi-supervised ranking, rank aggregation, and listwise ranking;
  • Low level features were included for investigation;
  • Several baselines were included.

LETOR 3.0 baselines were updated in June 2009.


LETOR 3.0 was released in December 2008.

  • Four new datasets were added: homepage finding 2003, homepage finding 2004, named page finding 2003, and named page finding 2004. Together with the three datasets from LETOR 2.0 (OHSUMED, topic distillation 2003, and topic distillation 2004), LETOR 3.0 contains seven datasets.
  • A new document sampling strategy was used for each query, so the three datasets carried over from LETOR 2.0 differ from their earlier versions;
  • New low-level features for learning;
  • Metadata is provided for better investigation of ranking features;
  • More baselines.

LETOR 2.0 was released in December 2007.

  • More baseline results on the LETOR dataset, including ListNet, AdaRank, FRank and MHR.
  • Updated evaluation tools (Eval-Rank.pl and Eval-ttest.pl); see the NDCG sketch after this list.
  • A channel to accept more baselines from other researchers.
  • A discussion board for researchers to exchange their ideas on learning to rank and the LETOR dataset.
  • A hub for researchers/research groups, papers, and resources on learning to rank.
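
Since the evaluation tools report ranking measures such as NDCG, a minimal NDCG@k sketch is given below for orientation, assuming the common graded-relevance gain (2^rel - 1) with a log2(rank + 1) discount; the official Eval-Rank.pl script remains the reference implementation.

    import math

    def dcg_at_k(labels, k):
        """Discounted cumulative gain over labels listed in ranked order."""
        return sum((2 ** rel - 1) / math.log2(i + 2)
                   for i, rel in enumerate(labels[:k]))

    def ndcg_at_k(labels, k):
        """NDCG@k: DCG of the ranking divided by DCG of the ideal ranking."""
        ideal = dcg_at_k(sorted(labels, reverse=True), k)
        return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0

    if __name__ == "__main__":
        # relevance labels of documents in the order a ranker returned them
        print(ndcg_at_k([2, 0, 1, 2, 0], k=5))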

References

  • More details about LETOR can be found in the paper "LETOR: A Benchmark Collection for Research on Learning to Rank for Information Retrieval". [pdf] [DOI]
  • LETOR 3.0 can be cited as
    Tao Qin, Tie-Yan Liu, Jun Xu, and Hang Li. LETOR: A Benchmark Collection for Research on Learning to Rank for Information Retrieval, Information Retrieval Journal, 2010.
  • LETOR 4.0 can be cited as
    Tao Qin and Tie-Yan Liu. Introducing LETOR 4.0 Datasets, arXiv preprint arXiv:1306.2597. [pdf]

===== Editor: Tao Qin, Tie-Yan Liu; Designer: Ruochi Zhang =====

©2009 Microsoft Corporation. All rights reserved.