Abstract:
Background: Abbreviations are considered an essential part of the clinical narrative; they are used not only to save time and space but also to obscure serious or incurable illnesses. Misinterpreting clinical abbreviations can affect patients themselves as well as other services such as clinical support systems. The scientific community has no consensus on how new abbreviations should be created, which makes them difficult to understand. Clinical abbreviation disambiguation aims to predict the exact meaning of an abbreviation from its context, a crucial step in understanding clinical notes.
Objectives: Disambiguating clinical abbreviations is an essential task in information extraction from medical texts. Deep contextualized representation models have shown promising results in most word sense disambiguation tasks. In this work, we propose a one-fits-all classifier that disambiguates clinical abbreviations using deep contextualized representations from pretrained language models such as Bidirectional Encoder Representations from Transformers (BERT).
Methods: A set of experiments with different pretrained clinical BERT models was performed to investigate fine-tuning methods for the disambiguation of clinical abbreviations. One-fits-all classifiers were used to improve the disambiguation of rare clinical abbreviations.
Results: One-fits-all classifiers with deep contextualized representations from the Bioclinical, BlueBERT, and MS_BERT pretrained models improved accuracy on the University of Minnesota data set, achieving 98.99%, 98.75%, and 99.13%, respectively. All three models outperform the previous state of the art of around 98.39%, with MS_BERT giving the best accuracy.
Conclusion: Fine-tuning pretrained language models to obtain deep contextualized representations proved sufficient for disambiguating clinical abbreviations; the approach could be robust for rare and unseen abbreviations, and it avoids building a separate classifier for each abbreviation. Transfer learning can improve the development of practical abbreviation disambiguation systems.
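The idea of a single classifier shared by all abbreviations, rather than one classifier per abbreviation, can be illustrated with a minimal sketch. It is not the authors' implementation: the sense inventory, the function names `build_label_space` and `predict_sense`, and the step that restricts the prediction to the candidate senses of the target abbreviation are all assumptions made for illustration, and the score vector stands in for the output of a fine-tuned BERT classification head.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical sense inventory; abbreviations and senses are illustrative only.
SENSE_INVENTORY: Dict[str, List[str]] = {
    "RA": ["rheumatoid arthritis", "right atrium", "room air"],
    "PT": ["physical therapy", "prothrombin time"],
}


def build_label_space(inventory: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Flatten every (abbreviation, sense) pair into one shared label list."""
    labels: List[Tuple[str, str]] = []
    for abbr in sorted(inventory):
        for sense in inventory[abbr]:
            labels.append((abbr, sense))
    return labels


def predict_sense(
    abbr: str, scores: Sequence[float], labels: List[Tuple[str, str]]
) -> str:
    """Return the best-scoring sense among the labels that belong to `abbr`.

    `scores` has one entry per label in the shared label space. Restricting
    the argmax to the abbreviation's own senses is an assumption of this sketch.
    """
    if len(scores) != len(labels):
        raise ValueError("scores must have one entry per label")
    candidates = [i for i, (a, _) in enumerate(labels) if a == abbr]
    if not candidates:
        raise KeyError(f"unknown abbreviation: {abbr}")
    best = max(candidates, key=lambda i: scores[i])
    return labels[best][1]


LABELS = build_label_space(SENSE_INVENTORY)

# Placeholder scores, one per label, standing in for a model's output.
example_scores = [0.1, 0.7, 0.05, 0.9, 0.2]
ra_sense = predict_sense("RA", example_scores, LABELS)
pt_sense = predict_sense("PT", example_scores, LABELS)
```

Because every abbreviation shares one output layer in this arrangement, adding an abbreviation means adding labels rather than training another model, which is the property the conclusion highlights.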
