Abstract: Summarizing patient clinical notes is vital for reducing the documentation burden on medical staff, who currently struggle with manual summarization. We propose an automatic method using large language models (LLMs). Long inputs, however, cause LLMs to lose context, which reduces output quality, especially in small models. We used a 7B-parameter model, open-calm-7b, enhanced with Naive Bayes-based Context Extension (NBCE) and a redesigned decoding mechanism that references one sentence at a time, keeping each input within the model's 2048-token context window. On 200 samples, our improved model achieved near parity on ROUGE-L with Google's Gemini, a model with over 175B parameters, indicating strong performance with fewer resources and improving the feasibility of automated electronic medical record (EMR) summarization.
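
The abstract reports results on ROUGE-L, which scores a candidate summary by the longest common subsequence (LCS) it shares with a reference summary. The following is a minimal sketch of the standard LCS-based F-measure, assuming whitespace tokenization; the function names `lcs_length` and `rouge_l` are illustrative, and the tokenizer and scoring tool the authors actually used are not stated here.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-measure on whitespace tokens (tokenization is an assumption)."""
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    # Weighted harmonic mean; beta = 1 weights precision and recall equally.
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    print(rouge_l("patient admitted with fever", "patient was admitted with high fever"))
```

For text in languages without whitespace word boundaries, the `split()` calls would need to be replaced by a suitable word or subword tokenizer.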

What problem does this paper attempt to address?

The problem this paper attempts to solve is: **how to optimize the automatic summarization of long clinical records, so as to reduce the manual summarization burden on medical staff and improve the quality and efficiency of summarization**. Specifically, the paper focuses on the following problems:

1. **Inefficiency of manual summarization**: Medical staff currently spend a great deal of time manually summarizing patients' clinical records. This increases their workload and may also lead to inaccurate summaries or the omission of important information.
2. **Limitations of existing automatic summarization methods**: Although large language models (LLMs) perform well in text summarization, the limited context window makes them prone to losing context information on long clinical records, which lowers output quality. The performance of small-scale LLMs (such as 7B-parameter models) drops especially sharply on long inputs.
3. **Resource and cost issues**: Ultra-large-scale LLMs on cloud platforms (such as models with more than 175B parameters) can handle longer texts, but they are costly to deploy and use, and they carry data security and privacy risks. In addition, hardware resources within hospitals are limited, making it difficult to support models of this scale.

To solve these problems, the paper proposes a method based on Dynamic Context Extension (DCE), combined with an improved decoding mechanism, that uses a smaller-scale LLM (the 7B-parameter open-calm-7b model) to achieve efficient automatic summarization of clinical records.
Through this method, the paper aims to achieve the following goals:

- **Improve summarization quality**: By improving the context-handling mechanism, ensure that the model maintains relatively high summarization quality when dealing with long clinical records.
- **Reduce costs and resource consumption**: Use a smaller-scale model to reduce dependence on expensive hardware and cloud-computing resources and to lower deployment and operation costs.
- **Enhance data security and privacy protection**: By deploying the model locally, avoid uploading patient data to the cloud, thereby better protecting data security and privacy.
- **Reduce communication latency**: Locally deployed models can significantly reduce the latency caused by network communication and improve the speed and efficiency of clinical decision-making.

Overall, the goal of this paper is to develop an automatic clinical-record summarization system that is efficient, low-cost, secure, and suitable for real medical environments.
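
The context-extension idea can be illustrated with a small sketch. The text here does not give the paper's formula, so the code below follows one commonly cited formulation of Naive Bayes-based Context Extension: each context chunk (for example, one sentence of the record) is scored independently by the model, the most confident prediction is selected, and the context-free prediction is subtracted with a weight `beta`. The function names, the `beta = 0.25` default, and the minimum-entropy pooling rule are assumptions for illustration, not the authors' confirmed implementation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def entropy(log_probs):
    """Shannon entropy of a distribution given as log-probabilities."""
    return -sum(math.exp(lp) * lp for lp in log_probs)


def nbce_combine(context_logits, prior_logits, beta=0.25):
    """Combine next-token predictions from several independent contexts.

    context_logits: one logit vector per context chunk, each obtained by
                    running the model on (chunk + text generated so far).
    prior_logits:   logit vector from the model run without any chunk.
    Returns (combined log-probabilities, index of the selected chunk).
    """
    ctx_logp = [log_softmax(l) for l in context_logits]
    prior_logp = log_softmax(prior_logits)
    # Pooling (assumed): keep the chunk with the lowest-entropy prediction.
    best = min(range(len(ctx_logp)), key=lambda k: entropy(ctx_logp[k]))
    pooled = ctx_logp[best]
    # Naive Bayes combination: boost the pooled context, discount the prior.
    combined = [(1 + beta) * p - beta * q for p, q in zip(pooled, prior_logp)]
    return log_softmax(combined), best


if __name__ == "__main__":
    # Toy vocabulary of 3 tokens, two context chunks, uninformative prior.
    logp, chosen = nbce_combine([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]], [0.0, 0.0, 0.0])
    print(chosen, [round(math.exp(x), 3) for x in logp])
```

Because every chunk is scored separately, no single model input needs to exceed the context window, which is what allows a model limited to 2048 tokens to draw on a much longer record.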

Optimizing Automatic Summarization of Long Clinical Records Using Dynamic Context Extension: Testing and Evaluation of the NBCE Method
