Eric Chang, Yan Xu, Kai Hong, Jianqiang Dong, and Zhaoquan Gu
Objective: A hybrid approach is presented as an application of natural language processing (NLP) to clinical discharge summaries for the i2b2/VA relation challenge. The challenge, focusing on translating narrative text to structured representation in the medical domain, consists of three tasks: (1) extraction of concepts including medical problems, treatments, and tests; (2) classification of assertions made on medical problems; and (3) identification of relations between a medical problem and another medical problem, treatment, or test.
Design: The overall hybrid approach consists of the following steps: (1) pre-processing the sentences; (2) marking noun phrases (NPs) and adjective phrases (APs) using the improved SharpNLP tools; (3) extracting the concepts using Conditional Random Fields (CRF); (4) classifying medical problems into different assertion categories based on voting by various classifiers; and (5) identifying different relation categories using normalized sentences and voting by various classifiers.
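Steps (4) and (5) combine several classifiers by voting. The abstract does not state the voting scheme, so the following is only a minimal sketch that assumes simple majority voting over per-classifier labels. The function name `majority_vote`, the tie-breaking rule, and the example labels are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most classifiers.

    `predictions` is a non-empty list with one predicted label per classifier.
    Ties are broken in favour of the label that appears first in the list
    (an assumed rule; the source does not describe tie handling).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical outputs of three assertion classifiers for one medical problem.
votes = ["present", "absent", "present"]
winner = majority_vote(votes)
```

Weighted voting, where each classifier's vote counts in proportion to its validation accuracy, would be a natural variant of the same idea.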
Measurements: The primary performance metrics were macro-averaged and micro-averaged precision, recall, and F-measure, each evaluated under exact and inexact matching.
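The difference between the two averaging schemes can be illustrated with a short sketch. Micro-averaging pools the true-positive, false-positive, and false-negative counts over all classes before computing the metrics, whereas macro-averaging computes the metrics per class and then takes their unweighted mean. The helper names, the per-class counts, and the choice to define the macro F-measure as the mean of per-class F-measures are assumptions made for illustration; the challenge's official scorer may differ in detail.

```python
def prf(tp, fp, fn):
    """Precision, recall, and F-measure (F1) from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


def micro_macro(counts):
    """Return (micro, macro) triples of (precision, recall, F-measure).

    `counts` maps each class name to a (tp, fp, fn) tuple.
    """
    # Micro: pool the counts across classes, then compute the metrics once.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = prf(tp, fp, fn)

    # Macro: compute the metrics per class, then average them unweighted.
    per_class = [prf(*c) for c in counts.values()]
    macro = tuple(sum(m[i] for m in per_class) / len(per_class) for i in range(3))
    return micro, macro


# Made-up counts for two concept classes (not results from the paper).
example = {"problem": (8, 2, 2), "test": (1, 1, 3)}
micro, macro = micro_macro(example)
```

With these made-up counts the frequent class dominates the micro-average, while the macro-average is pulled down by the poorly recognized rare class.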
Results: The hybrid system achieved a micro-averaged F-measure of 0.7973 for the concept task, 0.9210 for the assertion task, and 0.7012 for the relation task, obtaining XX place for concept extraction, XX place for assertion classification, and XX place for relation identification.
Conclusions: The submitted results show that our hybrid approach is feasible when the training data are sufficient, indicating that the combined techniques of machine learning and NLP have a promising future in applications of narrative electronic medical records.
In Proceedings of the 2010 i2b2/VA/Cincinnati Workshop on Challenges in Natural Language Processing for Clinical Data