Self-Attention Guided Copy Mechanism for Abstractive Summarization.

Song Xu, Haoran Li, Peng Yuan, Youzheng Wu, Xiaodong He, Bowen Zhou
DOI: https://doi.org/10.18653/v1/2020.acl-main.125
2020-01-01
Abstract: Copy modules are widely used in recent abstractive summarization models, allowing the decoder to copy words from the source into the summary. Typically, the encoder-decoder attention serves as the copy distribution, but guaranteeing that important source words are copied remains a challenge. In this work, we propose a Transformer-based model that enhances the copy mechanism. Specifically, we estimate the importance of each source word by its degree centrality in a directed graph built from the Transformer's self-attention layer, and we use that centrality to guide the copy process explicitly. Experimental results show that the self-attention graph provides useful guidance for the copy distribution. Our proposed models significantly outperform the baseline methods on the CNN/Daily Mail and Gigaword datasets.
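
The abstract does not say exactly how centrality is computed or how it enters the copy distribution, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a single row-normalized self-attention matrix is read as a weighted directed graph (an edge from word i to word j with weight attn[i][j]), that a word's centrality is its weighted in-degree (the column sum), and that guidance is a simple multiplicative reweighting of an existing copy distribution. The function names `indegree_centrality` and `guide_copy` are hypothetical and are not taken from the paper.

```python
def indegree_centrality(attn):
    """Weighted in-degree of each source word.

    attn[i][j] is the attention weight that source word i places on
    source word j (each row sums to 1). Reading this as a directed
    edge i -> j, the centrality of word j is the sum of column j.
    This is an illustrative assumption, not the paper's exact formula.
    """
    n = len(attn)
    return [sum(attn[i][j] for i in range(n)) for j in range(n)]


def guide_copy(copy_dist, centrality):
    """Reweight a copy distribution by centrality and renormalize.

    Hypothetical guidance rule: multiply each copy probability by the
    word's centrality, then divide by the total so the result sums to 1.
    If every product is zero, the original distribution is returned.
    """
    weighted = [p * c for p, c in zip(copy_dist, centrality)]
    total = sum(weighted)
    if total == 0:
        return list(copy_dist)
    return [w / total for w in weighted]


# Toy example: three source words; word 1 receives most of the attention.
attn = [
    [0.2, 0.7, 0.1],
    [0.3, 0.6, 0.1],
    [0.1, 0.8, 0.1],
]
copy_dist = [1 / 3, 1 / 3, 1 / 3]  # uniform encoder-decoder attention

centrality = indegree_centrality(attn)
guided = guide_copy(copy_dist, centrality)
```

In this toy case the uniform copy distribution is shifted toward word 1, the word that the other source words attend to most. A real model would compute the attention matrix inside the encoder and would likely learn how to combine centrality with the decoder's attention rather than multiplying the two directly.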