Abstract:

Objective: The authors' goal was to develop and evaluate machine-learning-based approaches to extracting clinical entities, including medical problems, tests, and treatments, as well as their asserted status, from hospital discharge summaries written in natural language. This project was part of the 2010 Center of Informatics for Integrating Biology and the Bedside/Veterans Affairs (VA) natural-language-processing challenge.

Design: The authors implemented a machine-learning-based named entity recognition system for clinical text and systematically evaluated the contributions of different types of features and machine-learning (ML) algorithms, using a training corpus of 349 annotated notes. Based on the results from the training data, the authors developed a novel hybrid clinical entity extraction system, which integrated heuristic rule-based modules with the ML-based named entity recognition module. The authors applied the hybrid system to the concept extraction and assertion classification tasks in the challenge and evaluated its performance using a test data set of 477 annotated notes.

Measurements: Standard measures including precision, recall, and F-measure were calculated using the evaluation script provided by the Center of Informatics for Integrating Biology and the Bedside/VA challenge organizers. The overall performance for all three types of clinical entities and all six types of assertions across the 477 annotated notes was considered the primary metric in the challenge.

Results and discussion: Systematic evaluation on the training set showed that Conditional Random Fields outperformed Support Vector Machines, and that semantic information from existing natural-language-processing systems substantially improved performance, although contributions from different types of features varied. The authors' hybrid entity extraction system achieved a maximum overall F-score of 0.8391 for concept extraction (ranked second) and 0.9313 for assertion classification (ranked fourth, but not statistically different from the first three systems) on the test data set in the challenge.
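The precision, recall, and F-measure named in the abstract can be illustrated with a minimal sketch. It is not the challenge organizers' evaluation script: the `(start, end, type)` entity representation, the exact-match criterion, the function name `precision_recall_f1`, and the sample entities are all assumptions made for illustration.

```python
from typing import Set, Tuple

# Hypothetical representation: (start_token, end_token, concept_type).
Entity = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Exact-match scores: a prediction counts only if span and type both match."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    # The F-measure here is the balanced harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Made-up annotations for one note (not taken from the challenge data).
gold = {(0, 1, "problem"), (5, 5, "test"), (8, 9, "treatment")}
predicted = {(0, 1, "problem"), (5, 5, "treatment"), (8, 9, "treatment")}

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.4f} recall={r:.4f} f1={f:.4f}")
```

In this toy case the second prediction has the right span but the wrong type, so it counts as both a false positive and a missed gold entity under exact matching.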

Extracting clinical concepts from user queries

CliNER 2.0: Accessible and Accurate Clinical Concept Extraction

Clinical Concept Extraction with Contextual Word Embedding

Clinical concept extraction: A methodology review

Enhancing Clinical Concept Extraction with Contextual Embeddings

Research of Clinical Named Entity Recognition Based on Bi-LSTM-CRF

Applying unsupervised keyphrase methods on concepts extracted from discharge sheets

Exploiting the concept level feature for enhanced name entity recognition in Chinese EMRs

Clinical concept and relation extraction using prompt-based machine reading comprehension

A Discrete Joint Model for Entity and Relation Extraction from Clinical Notes

Query Classification by Leveraging Explicit Concept Information

Exploiting Collaborative Learning for Concept Extraction in the Medical Field

Character-level Neural Network Model Based on Nadam Optimization and Its Application in Clinical Concept Extraction

NEAR: Named Entity and Attribute Recognition of clinical concepts

Conceptual annotation of Chinese queries from customers on the world wide Web

Clinical Text Generation through Leveraging Medical Concept and Relations

A Study of Machine-Learning-based Approaches to Extract Clinical Entities and Their Assertions from Discharge Summaries

Clinical Named Entity Recognition using Contextualized Token Representations

Automated concept-level information extraction to reduce the need for custom software and rules development

Long Concept Query on Conceptual Taxonomies

An Empirical Study of UMLS Concept Extraction from Clinical Notes using Boolean Combination Ensembles