Sentiment classification on customer feedback data: noisy data, large feature vectors, and the role of linguistic analysis

Michael Gamon

Abstract

We demonstrate that it is possible to perform automatic sentiment classification in the very noisy domain of customer feedback data. We show that by using large feature vectors in combination with feature reduction, we can train linear support vector machines that achieve high classification accuracy on data that present classification challenges even for a human annotator. We also show that, surprisingly, the addition of deep linguistic analysis features to a set of surface level word n-gram features contributes consistently to classification accuracy in this domain.
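
The pipeline summarized in the abstract (word n-gram features, feature reduction, then a linear support vector machine) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy feedback sentences, the document-frequency-difference score used for feature reduction, the bias-free subgradient trainer, and all function names and hyperparameters. The deep linguistic analysis features are not modeled here.

```python
import random
from collections import Counter

def ngrams(text, n_max=2):
    """Return word unigrams and bigrams (surface-level features)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def select_features(docs, labels, k):
    """Feature reduction: keep the k n-grams whose document frequency
    differs most between classes (an illustrative criterion only)."""
    pos, neg = Counter(), Counter()
    for doc, y in zip(docs, labels):
        (pos if y == 1 else neg).update(set(ngrams(doc)))
    scores = {f: abs(pos[f] - neg[f]) for f in set(pos) | set(neg)}
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]

def vectorize(doc, vocab):
    """Binary presence vector over the reduced vocabulary."""
    present = set(ngrams(doc))
    return [1.0 if f in present else 0.0 for f in vocab]

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200, seed=0):
    """Linear SVM without a bias term, trained by stochastic
    subgradient descent on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * dot(w, X[i])
            w = [wj * (1.0 - eta * lam) for wj in w]  # regularization step
            if margin < 1.0:                          # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(doc, vocab, w):
    """Return +1 (positive sentiment) or -1 (negative sentiment)."""
    return 1 if dot(w, vectorize(doc, vocab)) >= 0 else -1

# Hypothetical toy feedback data: +1 = satisfied, -1 = dissatisfied.
docs = [
    "great product works well",
    "love it great support",
    "works well love it",
    "terrible product broke fast",
    "hate it terrible support",
    "broke fast hate it",
]
labels = [1, 1, 1, -1, -1, -1]

vocab = select_features(docs, labels, k=8)
X = [vectorize(d, vocab) for d in docs]
w = train_linear_svm(X, labels)
```

With these toy inputs, `predict("great love it", vocab, w)` classifies the text as positive and `predict("broke fast terrible", vocab, w)` as negative. On real feedback data the vocabulary would be far larger, which is where the reduction step matters.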

Details

Publication type: Inproceedings
Published in: Proceedings of COLING-04, the 20th International Conference on Computational Linguistics
URL: http://www.coling.org/
Pages: 841–847
Address: Geneva, CH
Publisher: International Conference on Computational Linguistics