Leveraging Salience Analysis and Sparse Attention for Long Document Summarization

Yaxuan Chen, Dongning Rao, Zhihua Jiang
DOI: https://doi.org/10.1145/3639233.3639348
2023-12-15
Abstract: Extractive and abstractive summarization models have produced promising results on relatively short documents, but they still struggle with longer documents (e.g., scientific papers). Specifically, extractive models produce inaccurate or redundant summaries because their salience analysis is weak, while transformer-based abstractive models suffer from the quadratic cost, in the sequence length, of their full attention mechanism. To remedy this, we propose a novel hybrid model named LDSumm (Long Document Summarization). It is composed of an extractive module, which enhances salience analysis by leveraging the hierarchical structure of a document (especially section information), and an abstractive module, which introduces sparse attention ideas to increase the input size of BART. We conduct extensive experiments on two scientific-paper datasets: arXiv and PubMed. Experimental results show that LDSumm outperforms the baseline BART and other comparison models, and that it obtains a greater gain on arXiv, the dataset with longer papers.
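To make the quadratic-cost point concrete, the following minimal sketch counts how many query-key pairs full attention scores versus a sliding-window sparse pattern. The abstract does not say which sparse pattern LDSumm uses, so the sliding window here is an assumption chosen for illustration, and the function names and the `window` parameter are hypothetical rather than taken from the paper.

```python
def full_attention_pairs(n: int) -> int:
    """Query-key pairs scored by full self-attention over n tokens."""
    return n * n


def sliding_window_mask(n: int, window: int) -> list[list[bool]]:
    """Boolean mask where token i may attend to token j only if
    |i - j| <= window (an assumed pattern, not the paper's)."""
    return [[abs(i - j) <= window for j in range(n)] for i in range(n)]


def sliding_window_pairs(n: int, window: int) -> int:
    """Query-key pairs scored under the sliding-window mask,
    computed without materializing the n x n mask."""
    total = 0
    for i in range(n):
        lo = max(0, i - window)       # leftmost visible position
        hi = min(n - 1, i + window)   # rightmost visible position
        total += hi - lo + 1
    return total


if __name__ == "__main__":
    # Doubling n quadruples the full-attention count but roughly
    # doubles the windowed count when the window is held fixed.
    for n in (1024, 2048, 4096):
        print(n, full_attention_pairs(n), sliding_window_pairs(n, window=64))
```

With a fixed window, the number of scored pairs grows linearly in the sequence length instead of quadratically, which is the general reason sparse attention lets a model accept longer inputs.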
Computer Science