Simon Hill, Hugo Zaragoza, Ralf Herbrich, and Peter J. Rayner
January 2002
In this paper we study the problem of generalisation in information retrieval. In particular we study precision-recall curves and the average precision value. We provide two types of bounds: large-deviation bounds of the average precision and maximum deviation bounds with respect to a given point of the precision recall curve. The first type of bounds are useful to answer the question: how far can true average precision be from the value observed on a test collection? The second is useful for obtaining bounds on average precision when tight bounds on a particular point of the curve can be established, as is the case when training SVMs or Perceptrons for document categorisation.
![]() PostScript file |
In: Proceedings of the ACM SIGIR Workshop on Mathematical and Formal Methods in Information Retrieval
| Type: | Inproceedings |