Speaker: Alice Oh
Affiliation: Korea Advanced Institute of Science and Technology
Host: Jaime Teevan
Date recorded: 2 December 2010
Online reviews are often unstructured. We can extract useful information from them by automatically discovering which aspects people evaluate and which words they use to express sentiment toward those aspects. A generative topic model such as LDA is a good candidate, but it ignores the positions of words within a document when inferring topics. Words about one aspect of a review, however, tend to occur close to one another, so we propose Sentence-LDA (SLDA), which adds the constraint that all words in a single sentence are generated from one aspect. We then extend SLDA to the Aspect and Sentiment Unification Model (ASUM), which unifies aspect and sentiment. This model discovers (aspect, sentiment) pairs, which we call senti-aspects. We applied SLDA and ASUM to reviews of electronic devices and restaurants. The results show that the aspects discovered by SLDA match important evaluative details of the reviews, and that the senti-aspects found by ASUM capture aspects that are rarely mentioned without sentiment. Another advantage of ASUM is that it finds aspect-specific sentiment words without using the sentiment labels of the reviews. As a quantitative evaluation of senti-aspects, we performed sentiment classification. ASUM outperformed other generative models and came close to supervised classification methods.
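The sentence-level constraint behind SLDA can be illustrated with a small generative sketch. This is not the speaker's implementation: it only assumes the usual LDA-style setup (a per-review Dirichlet draw over aspects, then one aspect drawn per sentence), and the function names, the toy vocabulary in `ASPECTS`, and the `alpha` values are hypothetical choices made for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_review(sentence_lengths, aspect_word_dists, alpha, rng):
    """Generate a toy review in which every sentence is tied to a single aspect.

    Plain LDA would draw an aspect for each word independently. Here the aspect
    is drawn once per sentence, and all words in that sentence come from it.
    """
    theta = sample_dirichlet(alpha, rng)  # per-review distribution over aspects
    aspect_ids = list(range(len(aspect_word_dists)))
    review = []
    for n_words in sentence_lengths:
        # One aspect for the whole sentence (the sentence-level constraint).
        z = rng.choices(aspect_ids, weights=theta, k=1)[0]
        vocab, probs = zip(*aspect_word_dists[z].items())
        words = rng.choices(vocab, weights=probs, k=n_words)
        review.append((z, list(words)))
    return review


# Hypothetical aspect-word distributions, made up for illustration only.
ASPECTS = [
    {"battery": 0.5, "charge": 0.3, "hours": 0.2},
    {"screen": 0.5, "bright": 0.3, "pixels": 0.2},
]

rng = random.Random(0)
review = generate_review([4, 3, 5], ASPECTS, alpha=[0.5, 0.5], rng=rng)
for aspect_id, words in review:
    print(aspect_id, words)
```

Each printed line shows the aspect index drawn for a sentence followed by its words, all of which belong to that aspect's vocabulary. Inference in a real model runs in the opposite direction, recovering the aspects from observed reviews, which this sketch does not attempt.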
©2010 Microsoft Corporation. All rights reserved.