A Novel Attention Mechanism Considering Decoder Input for Abstractive Text Summarization

Jianwei Niu, Mingsheng Sun, Joel J. P. C. Rodrigues, Xuefeng Liu
DOI: https://doi.org/10.1109/icc.2019.8762040
2019-01-01
Abstract: Automatic text summarization has recently been widely used in text compression tasks. The attention mechanism is one of the most popular methods used in seq2seq (Sequence to Sequence) text summarization models. Current attention mechanisms usually use the hidden states of the encoder and the decoder to generate attention distributions. However, they ignore the information carried by the word that is about to be input into the decoder, which can lead to inaccurate attention distributions. In this work, we propose a novel attention mechanism that additionally incorporates the decoder input into the computation of the attention distribution. To the best of our knowledge, this is the first time that the decoder input has been added to the process of calculating the attention vector. The proposed attention mechanism considers context similarities as well as semantic similarities when generating attention distributions, which is closer to the behavior of a human summarizer. We applied our attention mechanism to a seq2seq-based summarization model and trained it on a large corpus containing hundreds of thousands of article-summary pairs. Experimental results on two summarization datasets demonstrate that our attention mechanism outperforms the existing well-known ones. On ROUGE-2, a popular evaluation metric for text summarization, our method obtains a relative gain of 2.93 over the popular Bahdanau Attention mechanism and an improvement of 2.21 over the best baseline method, Luong Attention.
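The idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's formulation, which the abstract does not give: it assumes an additive (Bahdanau-style) scoring function and simply adds a projected decoder-input embedding to the query. The function name `attention_with_decoder_input`, the weight matrices `W_h`, `W_s`, `W_x`, the vector `v`, and all dimensions are hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def attention_with_decoder_input(enc_states, dec_state, dec_input, W_h, W_s, W_x, v):
    """Additive attention whose score also depends on the decoder input.

    Assumed shapes (illustrative, not taken from the paper):
      enc_states: (T, d_h)  encoder hidden states
      dec_state:  (d_s,)    current decoder hidden state
      dec_input:  (d_x,)    embedding of the word about to enter the decoder
      W_h: (d_a, d_h), W_s: (d_a, d_s), W_x: (d_a, d_x), v: (d_a,)
    """
    # Standard additive attention uses only W_s @ dec_state as the query;
    # the extra W_x @ dec_input term is the assumption made in this sketch.
    query = W_s @ dec_state + W_x @ dec_input          # (d_a,)
    scores = np.tanh(enc_states @ W_h.T + query) @ v   # (T,)
    weights = softmax(scores)                          # attention distribution
    context = weights @ enc_states                     # (d_h,) context vector
    return weights, context


# Small demo with random parameters (values carry no meaning).
rng = np.random.default_rng(0)
T, d_h, d_s, d_x, d_a = 5, 8, 8, 4, 6
weights, context = attention_with_decoder_input(
    rng.normal(size=(T, d_h)),
    rng.normal(size=d_s),
    rng.normal(size=d_x),
    rng.normal(size=(d_a, d_h)),
    rng.normal(size=(d_a, d_s)),
    rng.normal(size=(d_a, d_x)),
    rng.normal(size=d_a),
)
print(weights.shape, context.shape)
```

Dropping the `W_x @ dec_input` term recovers ordinary additive attention over encoder and decoder hidden states, which is the kind of baseline the abstract contrasts against.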