Abstract: Background: Medical personnel are expected to read scores of reports each day covering the medical history of their patients. This reading task is crucial to the effectiveness of the healthcare provided. However, doctors often have to spend a lot of time going through these documents to extract a concise gist of the most medically relevant details, which can reduce the time left for doctor-patient interaction. In this scenario, the potential usefulness of an automatic clinical report summarization tool becomes apparent: such a system would save the doctor considerable effort and free up time for quality doctor-patient interaction. The focus of this paper is on extractive summarization.
Method: Owing to its vast pre-training, BERT (Bidirectional Encoder Representations from Transformers) is one of the most knowledgeable NLP (Natural Language Processing) models currently available, which makes it one of the best choices for a task like summarization. BERTSUM is the version of BERT fine-tuned for summarization, and BERTSUMEXT is its extractive summarization variant. The BERTSUMEXT architecture has previously been used to create a model that was extensively pre-trained on the CNN/DailyMail dataset of news articles and corresponding summaries. Testing showed that this pre-trained version of BERTSUMEXT does not perform very well on clinical reports and therefore needs to be improved before it can be employed in a clinical report summarization system. The method adopted here is to further train the BERTSUMEXT model on a clinical report summarization dataset using different training strategies and to assess the resulting performance improvement. The idea is to expand BERTSUMEXT’s knowledge to give it the ‘medical edge’ that it lacks.
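To make the extractive setting concrete: an extractive summarizer assigns a score to every sentence of the report and copies the highest-scoring sentences verbatim into the summary. The following is a minimal sketch of that selection step only, assuming the per-sentence scores are already available (in BERTSUMEXT they would be produced by the summarization layers on top of BERT). The function name, the example scores, and the value of `k` are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence


def select_extractive_summary(sentences: Sequence[str],
                              scores: Sequence[float],
                              k: int = 3) -> List[str]:
    """Return the k highest-scoring sentences, kept in document order."""
    if len(sentences) != len(scores):
        raise ValueError("need exactly one score per sentence")
    # Rank sentence indices by score, highest first.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Keep the top k, then restore the original order so the summary reads naturally.
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]


# Hypothetical report and hypothetical model scores, for illustration only.
report = [
    "Patient admitted with chest pain.",
    "Family visited in the afternoon.",
    "ECG showed ST elevation.",
    "Started on aspirin and heparin.",
]
example_scores = [0.91, 0.05, 0.88, 0.73]
summary = select_extractive_summary(report, example_scores, k=2)
# summary == ["Patient admitted with chest pain.", "ECG showed ST elevation."]
```

Because the output consists only of sentences copied from the report, further training can change which sentences are selected but never the wording of the summary itself.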
Results: The training strategy that modifies the parameter values of the extractive summarization layers of the BERTSUMEXT architecture shows a clear improvement on all nine parameters of the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) automatic evaluation metric and under the human evaluation paradigm. The ROUGE metric evaluates summary quality by measuring the overlap between the reference gold summary and the candidate summary generated by the model. The human evaluation paradigm is a method in which a professional doctor’s opinion is obtained on the quality of the summaries produced by the model.
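The overlap measurement behind ROUGE can be sketched in a few lines. The code below is a simplified illustration, not the official ROUGE toolkit: it tokenizes on whitespace, applies no stemming, and computes ROUGE-L on the whole summary rather than sentence by sentence. It also assumes that the nine parameters are precision, recall, and F1 for each of ROUGE-1, ROUGE-2, and ROUGE-L; the abstract does not spell this out. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def _ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _prf(overlap: int, cand_total: int, ref_total: int) -> Dict[str, float]:
    """Turn an overlap count into precision, recall, and F1."""
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def rouge_n(reference: str, candidate: str, n: int) -> Dict[str, float]:
    """N-gram overlap between the gold summary and the model summary."""
    ref = _ngrams(reference.lower().split(), n)
    cand = _ngrams(candidate.lower().split(), n)
    # Counter intersection clips each n-gram to its smaller count.
    overlap = sum((ref & cand).values())
    return _prf(overlap, sum(cand.values()), sum(ref.values()))


def rouge_l(reference: str, candidate: str) -> Dict[str, float]:
    """Longest-common-subsequence overlap (summary-level simplification)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    prev = [0] * (len(cand) + 1)
    for r in ref:
        curr = [0]
        for j, c in enumerate(cand, start=1):
            curr.append(prev[j - 1] + 1 if r == c else max(prev[j], curr[j - 1]))
        prev = curr
    return _prf(prev[-1], len(cand), len(ref))


def rouge_scores(reference: str, candidate: str) -> Dict[str, Dict[str, float]]:
    """Collect the nine values: P, R, F1 for ROUGE-1, ROUGE-2, ROUGE-L."""
    return {
        "rouge-1": rouge_n(reference, candidate, 1),
        "rouge-2": rouge_n(reference, candidate, 2),
        "rouge-l": rouge_l(reference, candidate),
    }


# Hypothetical gold and model summaries, for illustration only.
scores = rouge_scores("patient has acute chest pain", "patient has chest pain")
```

In this toy example the candidate omits the word "acute", so every candidate unigram appears in the reference (ROUGE-1 precision 1.0) while only four of the five reference unigrams are recovered (ROUGE-1 recall 0.8).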
