Abstract: Abstracts derived from biomedical literature possess distinct domain-specific characteristics, including specialised writing styles and biomedical terminologies, which necessitate a deep understanding of the related literature. As a result, existing language models struggle to generate technical summaries that are on par with those produced by biomedical experts, given the absence of domain-specific background knowledge. This paper aims to enhance the performance of language models in biomedical abstractive summarisation by aggregating knowledge from external papers cited within the source article. We propose a novel attention-based citation aggregation model that integrates domain-specific knowledge from citation papers, allowing neural networks to generate summaries by leveraging both the paper content and relevant knowledge from citation papers. Furthermore, we construct and release a large-scale biomedical summarisation dataset that serves as a foundation for our research. Extensive experiments demonstrate that our model outperforms state-of-the-art approaches and achieves substantial improvements in abstractive biomedical text summarisation.

What problem does this paper attempt to address?

The paper aims to improve the quality of automatically generated summaries of biomedical literature. Existing language models lack domain-specific background knowledge, so the technical summaries they produce struggle to match those written by biomedical experts. To address this, the paper proposes enhancing language models for biomedical abstractive summarisation by aggregating knowledge from the external papers that the source article cites.

### Main contributions of the paper:

1. **Constructing a large-scale biomedical literature dataset**: The paper constructs a large-scale biomedical literature dataset containing more than 10,000 instances, with an average of 16 cited references per instance. The dataset is built specifically for research on improving biomedical abstractive summarisation.
2. **Proposing a knowledge-aggregation framework based on cited literature**: The paper introduces an attention-based network that dynamically extracts features from the abstracts of cited papers and combines them with the content features of the main paper, thereby generating higher-quality summaries.
3. **Extensive experimental verification**: The paper verifies the effectiveness of the proposed framework through a large number of experiments. The results show that the model significantly outperforms existing state-of-the-art models on biomedical abstractive summarisation.

### Specific methods of the paper:

1. **Dataset construction**: The paper uses the open-source biomedical literature corpus provided by the Allen Institute and builds a structured dataset through strict screening and processing. Each sample contains information about the main paper and the papers it cites.
2. **Knowledge-aggregation framework**: The paper proposes an attention-based framework that extracts relevant knowledge from cited papers and combines it with the content of the main paper to generate high-quality summaries.
3. **Model training and evaluation**: The paper runs experiments with multiple pre-trained language models (such as BART and PEGASUS) and evaluates the generated summaries with several metrics, including ROUGE, BERTScore, and BartScore.

### Experimental results:

- **Performance improvement**: The proposed framework significantly outperforms the baseline model on ROUGE scores (F1), with a particularly large gain in recall.
- **Improvement in language quality**: Both perplexity (PPL) and ROUGE-L improve, indicating better language quality and less confusion during generation.
- **Ablation study**: Directly introducing a single cited paper does not yield the expected improvement, whereas the attention-based knowledge-aggregation model effectively reduces noise and improves performance.

In general, the paper significantly improves the quality of biomedical abstractive summarisation by introducing knowledge from cited literature, providing new ideas and methods for research in this field.
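The attention-based aggregation described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `aggregate_citations`, the use of scaled dot-product scoring, and the fusion by simple addition are all assumptions chosen for illustration, and the vectors stand in for encoder features of the main paper and of each cited abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def aggregate_citations(doc_vec, citation_vecs):
    """Hypothetical helper: weight cited-abstract features by relevance.

    doc_vec       -- feature vector of the main paper (assumed given)
    citation_vecs -- one feature vector per cited abstract (assumed given)
    Returns (attention weights, fused vector).
    """
    if not citation_vecs:
        # No citations available: fall back to the paper's own features.
        return [], list(doc_vec)
    scale = math.sqrt(len(doc_vec))
    # Score each citation against the main paper (scaled dot product).
    scores = [dot(doc_vec, c) / scale for c in citation_vecs]
    weights = softmax(scores)
    # Weighted sum of citation features: less relevant citations count less.
    context = [
        sum(w * c[i] for w, c in zip(weights, citation_vecs))
        for i in range(len(doc_vec))
    ]
    # Assumed fusion step: add the citation context to the paper features.
    fused = [d + k for d, k in zip(doc_vec, context)]
    return weights, fused


doc = [1.0, 0.0]
citations = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, fused = aggregate_citations(doc, citations)
```

In this toy input the first citation points in the same direction as the main paper and receives the largest weight, while the third points the opposite way and receives the smallest. That is the sense in which attention can down-weight noisy citations instead of treating every cited paper equally.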

Improving Biomedical Abstractive Summarisation with Knowledge Aggregation from Citation Papers

Amplifying Scientific Paper's Abstract by Leveraging Data-Weighted Reconstruction

Enhancing Scientific Papers Summarization with Citation Graph

HITS-based attentional neural model for abstractive summarization

KATSum: Knowledge-aware Abstractive Text Summarization

Synthesizing Scientific Summaries: An Extractive and Abstractive Approach

Methodical Systematic Review of Abstractive Summarization and Natural Language Processing Models for Biomedical Health Informatics: Approaches, Metrics and Challenges

Abstractive Summarization Improved by WordNet-based Extractive Sentences

Improved BIO-based Chinese Automatic Abstract-generation Model

A Supervised Approach to Extractive Summarisation of Scientific Papers

Improving Biomedical Pretrained Language Models with Knowledge

Neural Sequence-to-Sequence Modeling with Attention by Leveraging Deep Learning Architectures for Enhanced Contextual Understanding in Abstractive Text Summarization

uMedSum: A Unified Framework for Advancing Medical Abstractive Summarization

Integrating Topic-Aware Heterogeneous Graph Neural Network With Transformer Model for Medical Scientific Document Abstractive Summarization

MedicalSum: A Guided Clinical Abstractive Summarization Model for Generating Medical Reports from Patient-Doctor Conversations

SuMe: A Dataset Towards Summarizing Biomedical Mechanisms

Abstractive summarization incorporating graph knowledge

Mind The Facts: Knowledge-Boosted Coherent Abstractive Text Summarization

Automated Lay Language Summarization of Biomedical Scientific Reviews

Enhancing Abstractive Dialogue Summarization with Internal Knowledge