Automated Tools for Phenotype Extraction from Medical Records

  • Meliha Yetisgen-Yildiz
  • Cosmin A. Bejan
  • Lucy Vanderwende
  • Fei Xia
  • Heather L. Evans
  • Mark M. Wurfel

Proceedings of the American Medical Informatics Association Clinical Research Informatics Summit (AMIA CRI'13)

Published by American Medical Informatics Association

Clinical research on critical illness phenotypes relies on identifying clinical syndromes that are specified by consensus definitions. Historically, identifying phenotypes has required manual chart review, a time- and resource-intensive process. The overall research goal of the Critical Illness PHenotype ExtRaction (deCIPHER) project is to develop automated approaches, based on natural language processing and machine learning, that accurately identify phenotypes from electronic medical records (EMRs). We chose pneumonia as our first critical illness phenotype and conducted preliminary experiments to explore the problem space. In this abstract, we outline the tools we built for processing clinical records, present our preliminary findings for pneumonia extraction, and describe future steps.