Here or There: Preference Judgments for Relevance
- Ben Carterette
- Paul Bennett
- Max Chickering
- Susan Dumais
Proceedings of ECIR 2008
Published by Springer
ECIR 2018 Test of Time Honorable Mention
Information retrieval systems have traditionally been evaluated over absolute judgments of relevance: each document is judged for relevance on its own, independent of other documents that may be on topic. We hypothesize that preference judgments of the form “document A is more relevant than document B” are easier for assessors to make than absolute judgments, and provide evidence for our hypothesis through a study with assessors. We then investigate methods to evaluate search engines using preference judgments. Furthermore, we show that by using inferences and clever selection of pairs to judge, we need not compare all pairs of documents in order to apply evaluation methods.
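As a rough illustration of how inference can reduce the number of pairs an assessor must judge, the sketch below assumes that preferences are transitive (if A is preferred to B and B to C, then A is preferred to C). The abstract does not spell out the inference rules or the pair-selection strategy, so the transitivity assumption and the function names `transitive_closure` and `unjudged_pairs` are hypothetical and chosen only for this example.

```python
from itertools import combinations


def transitive_closure(prefs):
    """Return all preferences implied by `prefs` under assumed transitivity.

    `prefs` is a set of (a, b) tuples meaning "a is more relevant than b".
    """
    closure = set(prefs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a > b and b > d together imply a > d.
                if b == c and a != d and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def unjudged_pairs(docs, prefs):
    """List document pairs whose order is neither judged nor inferable."""
    known = transitive_closure(prefs)
    return [
        (a, b)
        for a, b in combinations(docs, 2)
        if (a, b) not in known and (b, a) not in known
    ]


docs = ["A", "B", "C", "D"]
judged = {("A", "B"), ("B", "C")}
remaining = unjudged_pairs(docs, judged)
```

With two judgments over four documents, the pair (A, C) is inferred rather than judged, so only the three pairs involving D remain for the assessor.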