Harmonising the Clinical Melody: Tuning Large Language Models for Hospital Course Summarisation in Clinical Coding

Bokang Bi, Leibo Liu, Sanja Lujic, Louisa Jorm, Oscar Perez-Concha
2024-09-24
Abstract: The increasing volume and complexity of clinical documentation in Electronic Medical Records systems pose significant challenges for clinical coders, who must mentally process and summarise vast amounts of clinical text to extract essential information needed for coding tasks. While large language models (LLMs) have been successfully applied to shorter summarisation tasks in recent years, the challenge of summarising a hospital course remains an open area for further research and development. In this study, we adapted three pre-trained LLMs (Llama 3, BioMistral, and Mistral Instruct v0.1) for the hospital course summarisation task, using Quantized Low-Rank Adaptation (QLoRA) fine-tuning. We created a free-text clinical dataset from MIMIC-III data by concatenating various clinical notes as the input clinical text, paired with ground-truth Brief Hospital Course sections extracted from the discharge summaries for model training. The fine-tuned models were evaluated using BERTScore and ROUGE metrics to assess the effectiveness of clinical domain fine-tuning. Additionally, we validated their practical utility using a novel hospital course summary assessment metric specifically tailored for clinical coding. Our findings indicate that fine-tuning pre-trained LLMs for the clinical domain can significantly enhance their performance in hospital course summarisation and suggest their potential as assistive tools for clinical coding. Future work should focus on refining data curation methods to create higher-quality clinical datasets tailored for hospital course summary tasks and adapting more advanced open-source LLMs comparable to proprietary models to further advance this research.
Computation and Language, Machine Learning
What problem does this paper attempt to address?
This paper addresses the burden that the growing volume and complexity of clinical documentation in Electronic Medical Record (EMR) systems places on clinical coders, who must mentally process and summarise large amounts of clinical text to extract the key information required for coding. Although large language models (LLMs) have been applied successfully to summarising shorter texts in recent years, hospital course summarisation remains an open area of research and development. The study therefore fine-tunes three pre-trained LLMs (Llama 3, BioMistral, and Mistral Instruct v0.1) with Quantized Low-Rank Adaptation (QLoRA) to improve their performance on hospital course summarisation, with the needs of clinical coding in mind. It also introduces a new assessment metric for hospital course summaries, tailored to clinical coding, to validate the practical utility of the generated summaries. More broadly, the authors explore how fine-tuning LLMs on domain-specific data can help them generate hospital course summaries suited to clinical coding, with the aim of improving the efficiency and accuracy of coding work.
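The data construction described in the abstract (concatenated clinical notes as input, the Brief Hospital Course section of the discharge summary as target) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the `[category]` note separator, and the header-detection rule (a section ends at the next line that consists only of a capitalised heading followed by a colon) are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumption: a new section starts at a line that is only a heading ending in ":".
HEADER_RE = re.compile(r"^[A-Z][A-Za-z ]{2,40}:$")
BHC_HEADER = "brief hospital course:"


def extract_brief_hospital_course(discharge_summary: str) -> Optional[str]:
    """Return the text under the 'Brief Hospital Course:' heading, or None."""
    collected: List[str] = []
    in_section = False
    for line in discharge_summary.splitlines():
        stripped = line.strip()
        if not in_section:
            if stripped.lower().startswith(BHC_HEADER):
                in_section = True
                # Keep any text that follows the heading on the same line.
                rest = stripped[len(BHC_HEADER):].strip()
                if rest:
                    collected.append(rest)
        else:
            if HEADER_RE.match(stripped):
                break  # reached the next section heading
            collected.append(stripped)
    text = " ".join(part for part in collected if part)
    return text or None


def build_example(
    notes: List[Tuple[str, str]], discharge_summary: str
) -> Optional[Dict[str, str]]:
    """Pair concatenated (category, text) notes with the extracted target.

    Returns None when the discharge summary has no usable target section.
    """
    target = extract_brief_hospital_course(discharge_summary)
    if target is None:
        return None
    # Hypothetical input layout: each note prefixed by its category in brackets.
    source = "\n\n".join(f"[{category}]\n{text.strip()}" for category, text in notes)
    return {"input": source, "target": target}
```

Admissions for which `build_example` returns `None` would simply be dropped from the training set in this sketch; which note types to include in the input is a separate curation decision that the abstract does not specify.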