LETOR4.0 Downloads

To use the datasets, you must read and accept the online agreement. By using the datasets, you agree to be bound by the terms of their license.

Datasets

Note that the two semi-supervised ranking datasets were updated on Jan. 7, 2010. Please download the new versions if you are using the old ones.
Setting                    Dataset        Size
Supervised ranking         MQ2007         ~65 MB
Supervised ranking         MQ2008         ~15 MB
Semi-supervised ranking    MQ2007-semi    ~940 MB
Semi-supervised ranking    MQ2008-semi    ~650 MB
Rank aggregation           MQ2007-agg     ~20 MB
Rank aggregation           MQ2008-agg     ~4 MB
Listwise ranking           MQ2007-list    ~950 MB
Listwise ranking           MQ2008-list    ~670 MB

The feature list for supervised ranking, semi-supervised ranking, and listwise ranking can be found in this document.

Evaluation tools

The evaluation scripts for LETOR4.0 differ slightly from those for LETOR3.0.
Please do not use the LETOR3.0 tools on LETOR4.0 data, or the LETOR4.0 tools on LETOR3.0 data.
Evaluation script for supervised ranking, semi-supervised ranking and rank aggregation
Evaluation script for listwise ranking
Significance test script for all four settings

Possible issues
If you are using a Linux machine and run into problems with the scripts, you may try the following solution from Sergio Daniel. Thanks to Sergio for sharing!
-------------------------
The evaluation script (http://research.microsoft.com/en-us/um/beijing/projects/letor//LETOR4.0/Evaluation/Eval-Score-4.0.pl.txt) isn't working for me on the letor 4.0 MQ2008 dataset. I use perl v5.14.2 on a linux machine. I made a little modification and now it is running =)
I replaced the line:
if ($lnFea =~ m/^(\d+) qid\:([^\s]+).*?\#docid = ([^\s]+) inc = ([^\s]+) prob = ([^\s]+)$/)
with:
if ($lnFea =~ m/^(\d+) qid\:([^\s]+).*?\#docid = ([^\s]+) inc = ([^\s]+) prob = ([^\s]+).$/)

Sergio.
-------------------------
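
The message does not say why the extra "." helps. One plausible explanation, offered here as an assumption rather than a confirmed diagnosis, is that the feature files use Windows (CRLF) line endings: on Linux, Perl's chomp removes only the "\n", so a stray "\r" remains and the original pattern's "$" no longer lines up with the end of the prob field. The sketch below transliterates both patterns into Python to show that behaviour. The sample line is made up to fit the pattern, and the TOLERANT pattern is a hypothetical alternative that is not part of Sergio's message or the official script.

```python
import re

# Python transliterations of the two Perl patterns quoted in the message above.
ORIGINAL = re.compile(
    r"^(\d+) qid:([^\s]+).*?#docid = ([^\s]+) inc = ([^\s]+) prob = ([^\s]+)$"
)
PATCHED = re.compile(
    r"^(\d+) qid:([^\s]+).*?#docid = ([^\s]+) inc = ([^\s]+) prob = ([^\s]+).$"
)
# Hypothetical alternative (not from the message): allow any trailing whitespace.
TOLERANT = re.compile(
    r"^(\d+) qid:([^\s]+).*?#docid = ([^\s]+) inc = ([^\s]+) prob = ([^\s]+)\s*$"
)

# Made-up feature line shaped to fit the pattern; the values are illustrative only.
SAMPLE = (
    "2 qid:10032 1:0.056537 2:0.000000 "
    "#docid = GX029-35-5894638 inc = 0.0119 prob = 0.139842"
)


def chomp(line):
    """Strip one trailing newline, as Perl's chomp does by default on Linux."""
    return line[:-1] if line.endswith("\n") else line


def prob_field(pattern, line):
    """Return the captured prob field, or None when the line does not match."""
    m = pattern.match(line)
    return m.group(5) if m else None


crlf_line = chomp(SAMPLE + "\r\n")  # a stray carriage return survives chomp
lf_line = chomp(SAMPLE + "\n")      # nothing is left over

for name, pattern in [("original", ORIGINAL), ("patched", PATCHED), ("tolerant", TOLERANT)]:
    print(name, repr(prob_field(pattern, crlf_line)), repr(prob_field(pattern, lf_line)))
```

Under this assumption, the original pattern fails on the CRLF line while the patched one matches it and captures the full prob value. On a line with a plain "\n" ending, however, the patched pattern still matches but its trailing "." consumes the last character of the prob field, so the captured value is truncated. If your copy of the data has Unix line endings, a whitespace-tolerant ending such as "\s*$" may be the safer change.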


Low-level information

Category                Data                                 Size
Meta data               Meta data for MQ2007 query set       ~60 MB
Meta data               Meta data for MQ2008 query set       ~50 MB
Meta data               Collection info                      ~1 KB
Relation information    Link graph of Gov2 collection        ~480 MB
Relation information    Sitemap of Gov2 collection           ~65 MB
Relation information    Similarity for MQ2007 query set      ~4.3 GB
Relation information    Similarity for MQ2008 query set      ~4.9 GB

© 2009 Microsoft Corporation. All rights reserved.