A General Approximation Framework for Direct Optimization of Information Retrieval Measures

MSR-TR-2008-164

Recently, direct optimization of information retrieval (IR) measures has become a new trend in learning to rank. Several methods have been proposed, and their effectiveness has been empirically verified. However, the theoretical justification of these algorithms has been insufficient, and many open problems remain. In this paper, we theoretically justify the approach of directly optimizing IR measures and propose a new general framework for this approach, which enjoys several theoretical advantages. The general framework, which can be used to optimize most IR measures, addresses the task by approximating the IR measures and optimizing the approximated surrogate functions. Theoretical analysis shows that high accuracy can be achieved by the approach. We take average precision (AP) and normalized discounted cumulative gain (NDCG) as examples to demonstrate how to realize the proposed framework. Experiments on benchmark datasets show that our approach is very effective compared to existing methods. The empirical results also agree well with the theoretical results obtained in the paper.
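To make the idea of "approximating the IR measure and optimizing the surrogate" concrete, the following is a minimal sketch and not the paper's own formulation, which this abstract does not spell out. It assumes one common smoothing choice: the rank position of each document, a step function of the score differences, is replaced by a sum of logistic functions, which gives a smooth approximation of NDCG. The function names and the scaling parameter `alpha` are illustrative assumptions.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def exact_positions(scores):
    # 1-based rank of each document under descending score (ties assumed absent).
    return [1 + sum(1 for j, sj in enumerate(scores) if j != i and sj > si)
            for i, si in enumerate(scores)]

def soft_positions(scores, alpha):
    # Assumed smoothing: each indicator [s_j > s_i] becomes sigmoid(alpha * (s_j - s_i)).
    return [1.0 + sum(sigmoid(alpha * (sj - si))
                      for j, sj in enumerate(scores) if j != i)
            for i, si in enumerate(scores)]

def dcg(gains, positions):
    # Discounted cumulative gain with the usual (2^gain - 1) / log2(1 + position) form.
    return sum((2.0 ** g - 1.0) / math.log2(1.0 + p)
               for g, p in zip(gains, positions))

def ideal_dcg(gains):
    # DCG of the best possible ordering; assumes at least one nonzero gain.
    ideal = sorted(gains, reverse=True)
    return dcg(ideal, range(1, len(ideal) + 1))

def ndcg(gains, scores):
    # Exact (non-differentiable) NDCG.
    return dcg(gains, exact_positions(scores)) / ideal_dcg(gains)

def approx_ndcg(gains, scores, alpha=10.0):
    # Smooth surrogate: same formula with soft positions in place of exact ranks.
    return dcg(gains, soft_positions(scores, alpha)) / ideal_dcg(gains)
```

In this sketch, a larger `alpha` makes the surrogate track the exact measure more closely, at the cost of a less smooth function. Because `approx_ndcg` is differentiable in the scores, it could in principle be maximized with any gradient-based optimizer.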