Overview of the TREC 2014 Web Track

  • Kevyn Collins-Thompson
  • Craig Macdonald
  • Paul N. Bennett
  • Fernando Diaz
  • Ellen M. Voorhees

In Proceedings of the 23rd Text REtrieval Conference (TREC '14)

The goal of the TREC Web track over the past few years has been to explore and evaluate innovative retrieval approaches over large-scale subsets of the Web, currently using ClueWeb12, on the order of one billion pages. For TREC 2014, the sixth year of the Web track, we implemented the following significant updates compared to 2013. First, the risk-sensitive retrieval task was modified to assess the ability of systems to adaptively perform risk-sensitive retrieval against multiple baselines, including an optional self-provided baseline. In general, the risk-sensitive task explores the tradeoffs that systems can achieve between effectiveness (overall gains across queries) and robustness (minimizing the probability of significant failure, relative to a particular provided baseline). Second, we added query performance prediction as an optional aspect of the risk-sensitive task. The Adhoc task continued as in TREC 2013 and was evaluated using both adhoc and diversity relevance criteria.
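
The tradeoff between effectiveness and robustness described above can be made concrete with a small sketch. The abstract does not give a formula, so the following is an assumption: a risk-adjusted utility that averages per-query gains over a baseline while weighting losses by a factor of (1 + alpha). The function name `risk_sensitive_utility`, the parameter `alpha`, and the per-query score dictionaries are hypothetical and chosen only for illustration; they are not taken from the track's official tooling.

```python
def risk_sensitive_utility(run_scores, baseline_scores, alpha=1.0):
    """Illustrative risk-adjusted utility (an assumed form, not the official measure).

    run_scores / baseline_scores: dicts mapping query id -> effectiveness score.
    alpha: extra penalty on losses; alpha = 0 gives the plain average gain.
    """
    if run_scores.keys() != baseline_scores.keys():
        raise ValueError("run and baseline must cover the same queries")
    if not run_scores:
        raise ValueError("at least one query is required")

    total = 0.0
    for qid, score in run_scores.items():
        delta = score - baseline_scores[qid]
        # Wins count at face value; losses are scaled by (1 + alpha).
        total += delta if delta >= 0 else (1.0 + alpha) * delta
    return total / len(run_scores)


# Hypothetical per-query scores: one win and one equally sized loss.
run = {"q1": 0.5, "q2": 0.25}
baseline = {"q1": 0.25, "q2": 0.5}
print(risk_sensitive_utility(run, baseline, alpha=0.0))
print(risk_sensitive_utility(run, baseline, alpha=1.0))
```

With alpha set to 0 the win and the loss cancel out; raising alpha makes the same run look worse, which is how a measure of this shape rewards robustness relative to the chosen baseline.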