Viewing Term Proximity from a Different Perspective

  • Ruihua Song
  • Ji-Rong Wen
  • Wei-Ying Ma

MSR-TR-2005-69

Various approaches have been explored for using word proximity information to improve the effectiveness of text retrieval systems. In previous work, loose phrases, composed of several query terms that may be separated by other words, are treated directly as additional independent query terms, although in fact some of them overlap. This paper revisits the term proximity scoring problem and proposes a new method for incorporating term proximity into ranking functions. The new method differs from prior methods in three respects: (1) nearby query terms are grouped into non-overlapping expanded spans, which represent the contexts of query terms; (2) the contribution of a query term to relevance is determined by both its contexts and its frequency; and (3) the method is relatively easy to plug into existing ranking functions, with the inverse document frequency component preserved. Experimental results on the TREC-9, TREC-10, and TREC-11 collections show that the proposed approach consistently improves retrieval precision.
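
The three aspects above can be illustrated with a minimal Python sketch. It is not the report's actual algorithm: the abstract does not give the span-construction rules or the weighting formula, so everything specific here is an assumption made for illustration. In particular, the gap threshold `max_gap`, the rule that a repeated term starts a new span, the span weight `n * n / width`, and the BM25-style saturation with parameter `k1` are all hypothetical choices, as are the function names.

```python
from collections import defaultdict


def find_hits(tokens, query_terms):
    """Return (position, term) pairs for every query-term occurrence, in order."""
    wanted = set(query_terms)
    return [(i, tok) for i, tok in enumerate(tokens) if tok in wanted]


def segment_spans(hits, max_gap=3):
    """Group hits into non-overlapping spans (assumed rules, not the report's).

    A new span starts when the distance to the previous hit exceeds max_gap,
    or when the incoming term already occurs in the current span.
    """
    spans, current = [], []
    for pos, term in hits:
        if current:
            too_far = pos - current[-1][0] > max_gap
            repeated = any(term == t for _, t in current)
            if too_far or repeated:
                spans.append(current)
                current = []
        current.append((pos, term))
    if current:
        spans.append(current)
    return spans


def relevance_contributions(spans):
    """Accumulate a per-term contribution from the spans that contain the term.

    Hypothetical weight: n * n / width, where n is the number of query terms
    in the span and width is the number of token positions it covers. Dense,
    term-rich spans score higher; an isolated hit scores 1.0.
    """
    rc = defaultdict(float)
    for span in spans:
        n = len(span)
        width = span[-1][0] - span[0][0] + 1
        weight = n * n / width
        for _, term in span:
            rc[term] += weight
    return dict(rc)


def score(tokens, query_terms, idf, k1=1.2, max_gap=3):
    """BM25-style score with the contribution replacing raw term frequency.

    The IDF factor is kept unchanged; document-length normalization is
    omitted to keep the sketch short.
    """
    spans = segment_spans(find_hits(tokens, query_terms), max_gap)
    rc = relevance_contributions(spans)
    return sum(idf.get(t, 0.0) * (k1 + 1) * v / (k1 + v) for t, v in rc.items())


if __name__ == "__main__":
    idf = {"term": 1.5, "proximity": 2.0}  # made-up IDF values for the demo
    near = "term proximity scoring is useful".split()
    far = "term a b c d e f proximity".split()
    print(score(near, ["term", "proximity"], idf))
    print(score(far, ["term", "proximity"], idf))
```

Under these assumptions, a document in which the two query terms are adjacent scores higher than one in which they are seven positions apart, even though each term occurs exactly once in both documents. Only the frequency-like factor changes; the IDF factor is left untouched, which is what makes this kind of scheme straightforward to plug into an existing ranking function.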