Harmonising the Clinical Melody: Tuning Large Language Models for Hospital Course Summarisation in Clinical Coding

Bokang Bi, Leibo Liu, Sanja Lujic, Louisa Jorm, Oscar Perez-Concha

2024-09-24

Abstract: The increasing volume and complexity of clinical documentation in Electronic Medical Records systems pose significant challenges for clinical coders, who must mentally process and summarise vast amounts of clinical text to extract essential information needed for coding tasks. While large language models have been successfully applied to shorter summarisation tasks in recent years, the challenge of summarising a hospital course remains an open area for further research and development. In this study, we adapted three pre-trained LLMs, Llama 3, BioMistral, and Mistral Instruct v0.1, for the hospital course summarisation task, using Quantized Low-Rank Adaptation fine-tuning. We created a free-text clinical dataset from MIMIC-III data by concatenating various clinical notes as the input clinical text, paired with ground-truth Brief Hospital Course sections extracted from the discharge summaries for model training. The fine-tuned models were evaluated using BERTScore and ROUGE metrics to assess the effectiveness of clinical domain fine-tuning. Additionally, we validated their practical utility using a novel hospital course summary assessment metric specifically tailored for clinical coding. Our findings indicate that fine-tuning pre-trained LLMs for the clinical domain can significantly enhance their performance in hospital course summarisation and suggest their potential as assistive tools for clinical coding. Future work should focus on refining data curation methods to create higher-quality clinical datasets tailored for hospital course summary tasks and adapting more advanced open-source LLMs comparable to proprietary models to further advance this research.
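To make the data construction and ROUGE evaluation described in the abstract more concrete, the following is a minimal sketch, not the authors' pipeline. It assumes each note is a dictionary with hypothetical keys `charttime`, `category`, and `text`, assumes notes are joined in chronological order (the abstract only says they are concatenated), and uses a simplified unigram-overlap ROUGE-1 F1 rather than an official ROUGE implementation. BERTScore is not covered here.

```python
import re
from collections import Counter


def build_example(notes, brief_hospital_course):
    """Pair concatenated clinical notes (input) with a Brief Hospital Course (target).

    Assumption: notes are ordered by their timestamp string before joining.
    """
    ordered = sorted(notes, key=lambda n: n["charttime"])
    source = "\n\n".join(
        f"[{n['category']}]\n{n['text'].strip()}" for n in ordered
    )
    return {"input": source, "target": brief_hospital_course.strip()}


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a simplification)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1_f1(candidate, reference):
    """Simplified ROUGE-1 F1: clipped unigram overlap between two texts."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # per-token minimum counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Illustrative, made-up notes; not real patient data.
notes = [
    {"charttime": "2101-01-02 08:00", "category": "Nursing",
     "text": "Patient stable overnight."},
    {"charttime": "2101-01-01 14:30", "category": "Radiology",
     "text": "Chest x-ray shows right lower lobe pneumonia."},
]
example = build_example(
    notes, "Admitted with pneumonia and treated with antibiotics."
)
score = rouge1_f1("Patient admitted with pneumonia.", example["target"])
print(round(score, 3))
```

In this toy case the candidate shares three of its four tokens with the seven-token reference, so precision is 3/4 and recall is 3/7. A real evaluation would use an established ROUGE package and model-generated summaries.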

Computation and Language, Machine Learning

What problem does this paper attempt to address?

This paper addresses the challenge that the growing quantity and complexity of clinical documents in Electronic Medical Record (EMR) systems pose for clinical coders, who must mentally process and summarize large amounts of clinical text to extract the key information required for coding tasks. Although large language models (LLMs) have been successfully applied to summarizing shorter texts in recent years, hospital course summarization remains an open area of research and development. The study therefore fine-tunes three pre-trained LLMs (Llama 3, BioMistral, and Mistral Instruct v0.1) with the Quantized Low-Rank Adaptation (QLoRA) technique to improve their performance on hospital course summarization, with a particular focus on the needs of clinical coding. The study also introduces a new evaluation metric for hospital course summaries, tailored to clinical coding, to validate the models' practical utility. Through this work, the authors explore how fine-tuning LLMs on domain-specific data can help them generate hospital course summaries that better meet the needs of clinical coding, with the aim of improving the efficiency and accuracy of clinical coding work.
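For readers unfamiliar with QLoRA, its core idea can be written compactly. The expression below is the standard low-rank adaptation formulation from the general LoRA and QLoRA literature, not a detail taken from this paper. The symbols W_0, A, B, r, and α are notation chosen here for illustration, and the rank and scaling values the authors used are not stated in this summary.

```latex
h = W_0 x + \frac{\alpha}{r} B A x,
\qquad W_0 \in \mathbb{R}^{d \times k},\;
B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k)
```

Here W_0 is the frozen pre-trained weight matrix, and only the small matrices A and B are trained. In QLoRA, W_0 is additionally stored in a 4-bit quantized format and dequantized on the fly during the forward pass, which reduces the memory needed for fine-tuning.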

Adapting Large Language Models for Automated Summarisation of Electronic Medical Records in Clinical Coding

A Dataset and Benchmark for Hospital Course Summarization with Adapted Large Language Models

Enhanced Electronic Health Records Text Summarization Using Large Language Models

Adapted large language models can outperform medical experts in clinical text summarization

Fine-tuning Large Language Models for Automated Diagnostic Screening Summaries

How Long Is Enough? Exploring the Optimal Intervals of Long-Range Clinical Note Language Modeling

A Comparative Study of Recent Large Language Models on Generating Hospital Discharge Summaries for Lung Cancer Patients

Towards Evaluating and Building Versatile Large Language Models for Medicine

A Framework to Assess Clinical Safety and Hallucination Rates of LLMs for Medical Text Summarisation

Evaluation of Large Language Models for Summarization Tasks in the Medical Domain: A Narrative Review

The Sound of Healthcare: Improving Medical Transcription ASR Accuracy with Large Language Models

Exploring the Efficacy of Large Language Models in Summarizing Mental Health Counseling Sessions: Benchmark Study

Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs

Does Biomedical Training Lead to Better Medical Performance?

Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach

Large language models encode clinical knowledge

Closing the gap between open-source and commercial large language models for medical evidence summarization

Comparative Analysis of Open-Source Language Models in Summarizing Medical Text Data