Extracting medical events from clinical records using conditional random fields and parameter tuning for hidden Markov models

Carolina Fócil-Arias, Grigori Sidorov, Alexander Gelbukh, Fernando Arce

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Recently, the extraction of clinical events from unstructured medical texts has attracted considerable attention from the research community. Machine learning approaches are popular for this task because they solve the sequence tagging problem effectively. It has been suggested previously that simple features, such as word unigrams, part-of-speech tags, and chunk tags, are sufficient for this task. We show that more careful preprocessing and feature selection can significantly improve the results. We used a conditional random field classifier with more linguistically oriented features and outperformed the current state-of-the-art approaches. We also show that the popular and much simpler Viterbi algorithm (a hidden Markov model-based classification algorithm) can produce competitive results when its parameters are tuned using specific optimization techniques. We evaluate these algorithms on the task of extracting medical events from the corpus developed for SemEval shared Task 12: Clinical TempEval (Temporal Evaluation) 2016, namely on its two subtasks: (i) event detection and (ii) event classification based on contextual modality.
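As an illustration of the Viterbi decoding step mentioned in the abstract, the following is a minimal sketch of log-space Viterbi decoding for a hidden Markov model tagger. It is not the authors' implementation: the two-tag set (`O`, `EVENT`), the toy probabilities, and the `unk_p` smoothing parameter for unseen words are all assumptions made for this example, and the abstract does not say which parameters the authors tuned or how.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p, unk_p=1e-6):
    """Return the most probable state sequence for `observations`.

    `unk_p` is a hypothetical smoothing value used as the emission
    probability of any word a state has never emitted in training.
    """
    if not observations:
        return []

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: log(start_p[s]) + log(emit_p[s].get(observations[0], unk_p))
             for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s].get(observations[t], unk_p))
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Toy model (all numbers are made up for illustration).
states = ["O", "EVENT"]
start_p = {"O": 0.8, "EVENT": 0.2}
trans_p = {
    "O": {"O": 0.7, "EVENT": 0.3},
    "EVENT": {"O": 0.6, "EVENT": 0.4},
}
emit_p = {
    "O": {"patient": 0.5, "denies": 0.4, "bleeding": 0.01},
    "EVENT": {"bleeding": 0.6, "pain": 0.3},
}

tags = viterbi(["patient", "denies", "bleeding"], states, start_p, trans_p, emit_p)
print(tags)  # ['O', 'O', 'EVENT']
```

In a sketch like this, a value such as `unk_p` is one example of a parameter that could be tuned on held-out data; whether it corresponds to anything the authors optimized is not stated in the abstract.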

Original language: English
Pages (from-to): 2935-2947
Number of pages: 13
Journal: Journal of Intelligent and Fuzzy Systems
Volume: 34
Issue number: 5
State: Published - 2018

Keywords

  • Clinical reports
  • Conditional random field
  • Feature selection
  • Machine learning
  • Medical information extraction
  • Natural language processing
  • Viterbi algorithm
