PROM: A Phrase-level Copying Mechanism with Pre-training for Abstractive Summarization

Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
DOI: https://doi.org/10.48550/arxiv.2305.06647
2024-01-01
Abstract: Based on the remarkable achievements of pre-trained language models in abstractive summarization, the copying mechanism has proved helpful by improving factuality, stability, and overall performance. This work proposes PROM, a new PhRase-level cOpying Mechanism that enhances attention on n-grams, which can be applied to zero-shot summarization with pre-training. PROM adds an indicator layer to explicitly pick up tokens in n-grams that can be copied from the source, and calculates an auxiliary loss for the copying prediction. Empirical studies show that PROM makes significant improvements in fine-tuning on benchmarks. In the zero-shot setting, PROM is utilized in self-supervised pre-training on raw corpora and provides new general baselines on a wide range of summarization datasets. Further analysis shows that PROM performs more reasonable copying and contributes to faithfulness.
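To make the idea of picking up "tokens in n-grams that can be copied from the source" concrete, here is a minimal sketch of how per-token copy labels might be derived. It is an illustration only: the abstract does not specify the labeling procedure, so the function name `copy_indicator_labels`, the whitespace tokenization, and the choice of n = 2 are all assumptions rather than the authors' method.

```python
def copy_indicator_labels(source_tokens, target_tokens, n=2):
    """Return one 0/1 label per target token.

    A token is labeled 1 if it lies inside at least one length-n window of
    the target that also occurs verbatim in the source (hypothetical rule).
    """
    # Collect every n-gram that appears in the source.
    source_ngrams = {
        tuple(source_tokens[i:i + n])
        for i in range(len(source_tokens) - n + 1)
    }
    labels = [0] * len(target_tokens)
    # Slide a window over the target and mark tokens in shared n-grams.
    for i in range(len(target_tokens) - n + 1):
        if tuple(target_tokens[i:i + n]) in source_ngrams:
            for j in range(i, i + n):
                labels[j] = 1
    return labels


# Toy example with whitespace tokenization (an assumption for illustration).
source = "the cat sat on the mat".split()
target = "a cat sat on a mat".split()
labels = copy_indicator_labels(source, target, n=2)
print(labels)  # [0, 1, 1, 1, 0, 0]
```

Labels of this kind could serve as targets for an indicator layer trained with an auxiliary loss, as the abstract describes. Note that "mat" is labeled 0 here even though it appears in the source, because it is not part of any shared bigram; this is what distinguishes phrase-level from token-level copying in this sketch.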