The area under the ROC curve (AUC) has been advocated as an evaluation criterion for the bipartite ranking problem. We study large deviation properties of the AUC; in particular, we derive a distribution-free large deviation bound for the AUC which serves to bound the expected accuracy of a ranking function in terms of its empirical AUC on an independent test sequence. A comparison of our result with a corresponding large deviation result for the classification error rate suggests that the test sample size required to obtain an epsilon-accurate estimate of the expected accuracy of a ranking function with delta-confidence is larger than that required to obtain an epsilon-accurate estimate of the expected error rate of a classification function with the same confidence. A simple application of the union bound allows the large deviation bound to be extended to learned ranking functions chosen from finite function classes.

}, author = {Shivani Agarwal and Thore Graepel and Ralf Herbrich and Dan Roth}, booktitle = {Advances in Neural Information Processing Systems 17}, month = {January}, pages = {9-16}, publisher = {MIT Press}, title = {A Large Deviation Bound for the Area Under the ROC Curve}, url = {http://research.microsoft.com/apps/pubs/default.aspx?id=65637}, year = {2004}, }