Wei Cheng, Xiaochuan Ni, Jian-Tao Sun, Xiaoming Jin, Hye-Chung Kum, Xiang Zhang, and Wei Wang
Opinion retrieval engines aim to retrieve documents containing user opinions towards a given search query. Unlike traditional IR engines, which rank documents by their topic relevance to the search query, opinion retrieval engines also consider opinion relevance: the returned documents should contain user opinions, and those opinions should be relevant to the search query. In previous opinion retrieval algorithms, opinion relevance scores are usually calculated with very straightforward approaches, e.g., the distance between the search query and opinion-carrying words. These approaches may cause two problems: 1) opinions in the returned documents are irrelevant to the search query; 2) opinions related to the search query are not well identified. In this paper, we propose a new approach to deal with this topic-opinion mismatch problem. We leverage the idea of Probabilistic Latent Semantic Analysis. Both queries and documents are represented in a latent topic space, and opinion relevance is then calculated semantically in this topic space. Experiments on the TREC blog datasets indicate that our approach is effective in measuring opinion relevance, and that the opinion retrieval system based on our algorithm yields significant improvements compared with most state-of-the-art methods.
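To make the idea of scoring opinion relevance in a latent topic space concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It assumes that a PLSA-style model has already produced a topic distribution for the query and for each opinion-bearing sentence of a document. The function names (`cosine`, `opinion_relevance`), the use of cosine similarity, the max aggregation, and the toy distributions are all assumptions made for illustration.

```python
import math


def cosine(p, q):
    """Cosine similarity between two topic-distribution vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def opinion_relevance(query_topics, opinion_sentence_topics):
    """Hypothetical score: the best topic-space match between the query
    and any opinion-bearing sentence in the document (an assumption,
    not the scoring function used in the paper)."""
    if not opinion_sentence_topics:
        return 0.0
    return max(cosine(query_topics, s) for s in opinion_sentence_topics)


# Toy topic distributions over three latent topics (made-up values).
query = [0.8, 0.1, 0.1]
doc_a = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]  # one opinion is on the query's topic
doc_b = [[0.1, 0.8, 0.1]]                   # its only opinion is off topic

score_a = opinion_relevance(query, doc_a)
score_b = opinion_relevance(query, doc_b)
```

In this toy setup `doc_a` scores higher than `doc_b` because one of its opinion sentences shares the query's dominant topic, even though no word-distance heuristic is involved.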
In the 3rd IEEE International Conference on Social Computing (SocialCom 2011)