Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward

Luyang Huang, Lingfei Wu, Lu Wang
DOI: https://doi.org/10.48550/arXiv.2005.01159
2020-05-04
Abstract: Sequence-to-sequence models for abstractive summarization have been studied extensively, yet the generated summaries commonly suffer from fabricated content, and are often found to be near-extractive. We argue that, to address these issues, the summarizer should acquire semantic interpretation over input, e.g., via structured representation, to allow the generation of more informative summaries. In this paper, we present ASGARD, a novel framework for Abstractive Summarization with Graph-Augmentation and semantic-driven RewarD. We propose the use of dual encoders (a sequential document encoder and a graph-structured encoder) to maintain the global context and local characteristics of entities, complementing each other. We further design a reward based on a multiple choice cloze test to drive the model to better capture entity interactions. Results show that our models produce significantly higher ROUGE scores than a variant without knowledge graph as input on both New York Times and CNN/Daily Mail datasets. We also obtain better or comparable performance compared to systems that are fine-tuned from large pretrained language models. Human judges further rate our model outputs as more informative and containing fewer unfaithful errors.
Computation and Language
What problem does this paper attempt to address?
This paper addresses two main problems with existing abstractive summarization models:

1. **Unfaithful generated content**: Existing sequence-to-sequence models often produce content that is not supported by the source text, so-called "fabricated content". Such content undermines the quality and reliability of the summary.
2. **Near-extractive summaries**: Although these models aim to generate abstractive summaries, their outputs often stay too close to sentences in the source text. They lack a deeper semantic interpretation of the input, which makes the summaries less informative and less novel.

To address these problems, the paper proposes ASGARD (Abstractive Summarization with Graph-Augmentation and semantic-driven RewarD), which improves summary generation in two ways:

- **Dual-encoder structure**: A sequential document encoder and a graph-structured encoder are used together. The former captures the global context of the document, and the latter captures the local characteristics of entities and their interactions. The two encoders complement each other and jointly improve summary quality.
- **Multiple-choice cloze reward**: A cloze test in multiple-choice form serves as a reward, and reinforcement learning uses it to guide the model toward better capturing entity interactions in the input, and thus toward more faithful and informative summaries.

Through these methods, the paper aims to produce summaries that are more faithful, more informative, and more abstractive.
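The multiple-choice cloze reward described above can be illustrated with a small sketch. This is not the authors' implementation: the names `ClozeQuestion`, `overlap_scorer`, and `cloze_reward` are hypothetical, and the token-overlap scorer is a toy stand-in for whatever trained question-answering model would actually judge the options. The sketch assumes the reward is the average probability assigned to the correct option when the generated summary is the only available context.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class ClozeQuestion:
    """A hypothetical multiple-choice cloze item (illustrative only)."""
    text: str            # sentence with one entity replaced by a blank
    options: List[str]   # candidate fillers for the blank
    answer_index: int    # position of the correct filler in `options`


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw option scores into a probability distribution."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def overlap_scorer(summary: str, question: str, option: str) -> float:
    """Toy stand-in for a QA model: count option tokens found in the summary.

    The question text is ignored here; a real scorer would condition on it.
    """
    summary_tokens = set(summary.lower().split())
    return float(sum(tok in summary_tokens for tok in option.lower().split()))


def cloze_reward(
    summary: str,
    questions: Sequence[ClozeQuestion],
    scorer: Callable[[str, str, str], float] = overlap_scorer,
) -> float:
    """Average probability of the correct option, given only the summary."""
    if not questions:
        return 0.0
    total = 0.0
    for q in questions:
        probs = softmax([scorer(summary, q.text, opt) for opt in q.options])
        total += probs[q.answer_index]
    return total / len(questions)


if __name__ == "__main__":
    questions = [
        ClozeQuestion("____ acquired the startup.", ["google", "amazon", "apple"], 0)
    ]
    # A summary naming the correct entity should score higher than one that does not.
    print(cloze_reward("google acquired the startup", questions))
    print(cloze_reward("amazon acquired the startup", questions))
```

In a reinforcement learning setup, a scalar like this could be used as the reward for a sampled summary, so that summaries preserving the right entities are reinforced. How the reward is combined with other training objectives is not specified in the text above and is left open here.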