Global Ranking of Documents Using Continuous Conditional Random Fields

MSR-TR-2008-156

This paper is concerned with ranking model construction in document retrieval. Traditionally, the ranking model is defined as a function of a query and a single document. In practice, many other factors that affect ranking can and must be taken into consideration, for instance similarities between documents and hyperlinks between documents. This calls for a new ranking model that is a function of a query and the entire set of documents retrieved with the query. This paper names this new problem ‘global ranking of documents’, in contrast to the traditional ‘local ranking of documents’. The paper proposes a novel learning-to-rank method to perform the task. The method employs Continuous Conditional Random Fields (Continuous CRF) as the model, a conditional probability distribution that represents the mapping from the retrieved documents to their ranking scores. The model can naturally use as features both the content information of documents and the relation information between documents for global ranking. A learning algorithm for creating the Continuous CRF is also presented in the paper. Taking Pseudo Relevance Feedback and Topic Distillation as examples, this paper shows how the learning method can be applied to global ranking. Experimental results on benchmark data show that the proposed method outperforms the baseline methods.
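To make the local/global distinction concrete, here is a minimal sketch in Python with NumPy. It is not the model from the report, whose exact form the abstract does not give. It assumes a hypothetical quadratic energy in which each document's score is pulled toward its own content-based score and the scores of similar documents are pulled toward each other, weighted by a hypothetical parameter `beta`. The function names, the similarity matrix, and the linear local scorer are all assumptions made for illustration.

```python
import numpy as np

def local_scores(features, weights):
    """Local ranking: each document is scored from its own features only."""
    return np.asarray(features, dtype=float) @ np.asarray(weights, dtype=float)

def global_scores(features, weights, similarity, beta):
    """Global ranking (illustrative assumption, not the report's model).

    Returns the score vector y that minimizes
        sum_i (y_i - s_i)^2 + (beta / 2) * sum_{i,j} S_ij * (y_i - y_j)^2,
    where s holds the local scores and S is a symmetric similarity matrix.
    Setting the gradient to zero gives (I + beta * L) y = s, with
    L = D - S the graph Laplacian of S.
    """
    s = local_scores(features, weights)
    S = np.asarray(similarity, dtype=float)
    S = (S + S.T) / 2.0           # symmetrize the similarities
    np.fill_diagonal(S, 0.0)      # ignore self-similarity
    L = np.diag(S.sum(axis=1)) - S
    return np.linalg.solve(np.eye(len(s)) + beta * L, s)

# Three retrieved documents with one content feature each.
# Documents 0 and 1 are similar to each other; document 2 is unrelated.
X = np.array([[1.0], [0.0], [0.2]])
w = np.array([1.0])
S = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])

local_order = np.argsort(-local_scores(X, w))
global_order = np.argsort(-global_scores(X, w, S, beta=1.0))
```

In this toy setup the local scores are 1.0, 0.0, and 0.2, so the local order is document 0, 2, 1. With `beta=1.0` the global scores are 2/3, 1/3, and 0.2, so document 1 moves above document 2 because it is similar to the top-scoring document. Setting `beta=0.0` recovers the local ranking exactly.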