Estimating Query Performance using Class Predictions

  • Kevyn Collins-Thompson
  • Paul Bennett

Poster-Paper in Proceedings of the 32nd Annual ACM SIGIR Conference (SIGIR 2009)

Published by ACM

We investigate using topic prediction data, as a summary of document content, to compute measures of search result quality. Unlike existing quality measures such as query clarity, which require the entire content of the top-ranked results, class-based statistics can be computed efficiently online because class information is compact enough to precompute and store in the index. In an empirical study, we compare the performance of class-based statistics to their language model counterparts for predicting two measures: query difficulty and expansion risk. Our findings suggest that class predictions can offer performance comparable to full language models while reducing computational overhead.
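To make the idea of a class-based statistic concrete, the following is a minimal sketch of a clarity-style score computed over topic class distributions instead of word distributions. It is an illustration under stated assumptions, not the method from the paper: the function name `class_clarity`, the uniform averaging over top-ranked documents, and the use of KL divergence against a collection-wide class prior are all assumptions made for this example.

```python
import math


def class_clarity(result_class_probs, collection_class_probs, eps=1e-12):
    """Clarity-style score over topic classes (hypothetical helper).

    result_class_probs: one class distribution per top-ranked document,
        as would be precomputed and stored in the index (assumption).
    collection_class_probs: class distribution of the whole collection.
    Returns the KL divergence, in bits, between the averaged result
    distribution and the collection distribution.
    """
    if not result_class_probs:
        raise ValueError("need at least one top-ranked document")

    num_docs = len(result_class_probs)
    num_classes = len(collection_class_probs)

    # Average the per-document class distributions with uniform weights
    # (an assumption; rank-based or score-based weights are also possible).
    avg = [
        sum(doc[c] for doc in result_class_probs) / num_docs
        for c in range(num_classes)
    ]

    # KL(avg || collection); terms with zero probability contribute nothing.
    return sum(
        p * math.log2(p / max(q, eps))
        for p, q in zip(avg, collection_class_probs)
        if p > 0
    )


# Toy data with two classes: a topically focused result list versus one
# that mirrors the collection as a whole.
collection = [0.5, 0.5]
focused = class_clarity([[0.9, 0.1], [0.8, 0.2]], collection)
diffuse = class_clarity([[0.5, 0.5], [0.5, 0.5]], collection)
```

In this sketch, a result list concentrated on a few classes scores higher than one whose class mix matches the collection, and the cost depends only on the number of classes and documents, not on vocabulary size.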