Adapting Large Language Models for Automated Summarisation of Electronic Medical Records in Clinical Coding

Bokang Bi,Leibo Liu,Oscar Perez-Concha
DOI: https://doi.org/10.3233/SHTI240886
2024-09-24
Abstract:Encapsulating a patient's clinical narrative into a condensed, informative summary is indispensable to clinical coding. The intricate nature of the clinical text makes the summarisation process challenging for clinical coders. Recent developments in large language models (LLMs) have shown promising performance in clinical text summarisation, particularly in radiology and echocardiographic reports, after adaptation to the clinical domain. To explore the summarisation potential of clinical domain adaptation of LLMs, a clinical text dataset, consisting of electronic medical records paired with "Brief Hospital Course" from the MIMIC-III database, was curated. Two open-source LLMs were then fine-tuned, one pre-trained on biomedical datasets and another on a general-content domain on the curated clinical dataset. The performance of the fine-tuned models against their base models were evaluated. The model pre-trained on biomedical data demonstrated superior performance after clinical domain adaptation. This finding highlights the potential benefits of adapting LLMs pre-trained on a related domain over a more generalised domain and suggests the possible role of clinically adapted LLMs as an assistive tool for clinical coders. Future work should explore adapting more advanced models to enhance model performance in higher-quality clinical datasets.
What problem does this paper attempt to address?