Searching Locally-Defined Entities

  • Zhaohui Wu,
  • Yuanhua Lv,
  • Ariel Fuxman

Proceedings of the 23rd ACM International Conference on Information and Knowledge Management

Published by ACM

When consuming content, users typically encounter entities that they are not familiar with. A common scenario is when users want to find information about entities directly within the content they are consuming. For example, when reading the book “Adventures of Huckleberry Finn”, a user may lose track of the character Mary Jane and want to find a paragraph in the book that gives relevant information about her. Today this is done by invoking the ubiquitous Find function (“Ctrl-F”). However, Find returns only exact-matching results without any relevance ranking, leading to a suboptimal user experience.

How can we go beyond the Ctrl-F function? To tackle this problem, we present algorithms for semantic matching and relevance ranking that enable users to effectively search and understand entities that have been defined in the content that they are consuming, which we call locally-defined entities. We first analyze the limitations of standard information retrieval models when applied to searching locally-defined entities, and then we propose a novel semantic entity retrieval model that addresses these limitations. We also present a ranking model that leverages multiple novel signals to model the relevance of a passage. A thorough experimental evaluation of the approach in the real-world application of searching characters within e-books shows that it outperforms the baselines by more than 60% in terms of NDCG.
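
For readers unfamiliar with the metric, NDCG (normalized discounted cumulative gain) can be sketched as follows. This is a minimal illustration of one common formulation (exponential gain, log2 rank discount); the abstract does not state which variant or cutoff the evaluation uses, so the gain function, the function names `dcg` and `ndcg`, and the optional cutoff `k` are assumptions made here for illustration only.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of graded relevance labels in ranked order.

    Assumes the exponential-gain variant: (2^rel - 1) / log2(rank + 1),
    with ranks starting at 1.
    """
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)
        for position, rel in enumerate(relevances)
    )


def ndcg(relevances, k=None):
    """DCG of the ranking divided by the DCG of the ideal (sorted) ranking.

    `relevances` lists the graded labels of the returned passages in the
    order the system ranked them. `k` is an optional cutoff (hypothetical
    parameter; the cutoff used in the paper is not given in the abstract).
    """
    ranked = relevances if k is None else relevances[:k]
    ideal = sorted(relevances, reverse=True)
    ideal = ideal if k is None else ideal[:k]
    ideal_dcg = dcg(ideal)
    # A query with no relevant passages scores 0 by convention here.
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0
```

Under this formulation a ranking that places the most relevant passages first scores 1.0, and an unranked exact-match list, such as the output of Ctrl-F in document order, scores lower whenever relevant passages appear late.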