Yan Xu, Yue Wang, Jiahua Liu, and Eric Chang
Objective: To create a sentiment classification system for the fifth i2b2 Challenge Track 2. The classification schema consists of thirteen subjective categories and two objective categories.
Design: We developed a hybrid system incorporating external resources and SVM classifiers. The system consists of three types of classification methods: one type with spanning n-gram features for subjective categories; another with bag-of-n-gram features for objective categories; and a third type based on uni-gram features for those subjective categories that require semantic understanding. The spanning n-grams are selected by leveraging an emotional corpus from weblogs. Special normalization of objective sentences is generalized with shallow parsing and external knowledge. We utilized three external resources: the LiveJournal weblog, which helps feature selection; the eBay List, which helps special normalization in information and instructions; and the suicide project web, which provides unlabeled data with properties similar to those of suicide notes.
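As an illustration of the bag-of-n-gram representation mentioned above, the following is a minimal sketch and not the authors' feature extractor. The function name `bag_of_ngrams`, the `max_n` parameter, whitespace tokenization, and the example sentence are all assumptions made for this example.

```python
from collections import Counter
from typing import List


def bag_of_ngrams(tokens: List[str], max_n: int = 2) -> Counter:
    """Count every contiguous n-gram of length 1..max_n in a token list.

    Hypothetical helper: the abstract does not specify n or the tokenizer.
    """
    feats: Counter = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of width n over the sentence.
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats


# Made-up instruction-like sentence, tokenized on whitespace (an assumption).
features = bag_of_ngrams("please pay the rent".split())
```

The resulting counts could then be mapped to a sparse vector and passed to a classifier such as an SVM; the order of n-grams within the sentence is discarded, which is what makes this a "bag" representation.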
Measurements: The performance is evaluated by the overall micro-averaged precision, recall and F-measure.
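Micro-averaging pools true positives, false positives and false negatives over all categories before computing the scores. The sketch below shows the standard computation under the assumption that each sentence carries a set of zero or more labels; the function name `micro_prf` and the toy labels are illustrative and are not the challenge's official scorer or data.

```python
from typing import List, Set, Tuple


def micro_prf(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F-measure over sentence label sets."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)  # labels predicted and correct
        fp += len(p - g)  # labels predicted but not in the gold set
        fn += len(g - p)  # gold labels that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Toy example (made up): four sentences, one label set each.
gold = [{"love"}, {"information", "instructions"}, set(), {"guilt"}]
pred = [{"love"}, {"information"}, {"hopelessness"}, set()]
p, r, f = micro_prf(gold, pred)
```

Because counts are pooled, frequent categories dominate the micro-averaged score, while a rare category with an F-measure of 0 lowers it only slightly.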
Result: The system achieved an overall micro-averaged F-measure of 0.59. Happiness_peacefulness achieved the highest F-measure of 0.81; Abuse and Forgiveness had the lowest F-measure of 0.
Conclusion: Our results indicate that it is not trivial to classify fine-grained sentiments at the sentence level. In addition, the experimental results show that more labeled training data yield better performance for the machine learning approach.
Published in: Proceedings of the 2011 i2b2/VA/Cincinnati Workshop on Challenges in Natural Language Processing for Clinical Data