Abstract: Clinical abbreviation disambiguation is a crucial task in the biomedical domain, because accurately identifying the intended meaning, or expansion, of each abbreviation in clinical text is vital for medical information retrieval and analysis. Existing approaches have shown promising results, but challenges persist, notably senses with few instances and abbreviations with ambiguous interpretations. In this paper, we propose an approach that addresses these challenges and improves the performance of clinical abbreviation disambiguation. Our objective is to leverage Large Language Models (LLMs) and to employ a Generative Model (GM) that augments the dataset with contextually relevant instances, enabling more accurate disambiguation across diverse clinical contexts. We integrate the contextual understanding of LLMs, represented by BlueBERT and Transformers, with data augmentation using a Generative Model, the Biomedical Generative Pre-trained Transformer (BIOGPT), which is pretrained on an extensive corpus of biomedical literature to capture the intricacies of medical terminology and context. By providing BIOGPT with relevant medical terms and sense information, we generate diverse instances of clinical text that accurately represent the intended meanings of abbreviations. We evaluate our approach on the widely recognized CASI dataset, partitioned into training, validation, and test sets. Incorporating data augmentation with the GM improves the model's performance, particularly for senses with limited instances, which mitigates dataset imbalance and the difficulty of distinguishing similar concepts. The results demonstrate the efficacy of the proposed method and the value of LLMs and generative techniques for clinical abbreviation disambiguation. Our model achieves good accuracy on the test set, outperforming previous methods.

Leveraging Large Language Models for Clinical Abbreviation Disambiguation
