Iterative Autoregressive Generation for Abstractive Summarization

Jiaxin Duan, Fengyu Lu, Junfei Liu
DOI: https://doi.org/10.1109/icassp48485.2024.10448387
2024-01-01
Abstract: Abstractive summarization suffers from exposure bias caused by teacher-forced maximum likelihood estimation (MLE) training: an autoregressive language model predicts the next-token distribution conditioned on the ground-truth preceding context during training, but on its own predictions at inference. Previous solutions to this problem augment the pure token-level MLE objective with summary-level objectives. Although this approach effectively exposes a model to its prediction errors during summary-level learning, such errors accumulate in unidirectional autoregressive generation and further limit learning efficiency. To address this problem, we imitate the human behavior of revising a manuscript multiple times after writing it and introduce a novel iterative autoregressive summarization (IARSum) paradigm, which iteratively rewrites a generated summary to approximate an error-free version. Concretely, IARSum performs iterative revisions after summarization, where the output of the previous revision is taken as the input for the next, and a minimum-risk training strategy is used to ensure that the original summary is effectively polished in every revision round. We conduct extensive experiments on two widely used datasets and show that IARSum achieves new or matching state-of-the-art performance.
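The revision loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `iterative_revise`, `toy_summarize`, and `toy_revise` are hypothetical, the toy functions stand in for trained models, and passing the source document to each revision step is an assumption (the abstract only states that each revision takes the previous output as input). The minimum-risk training strategy is not shown.

```python
from typing import Callable, List


def iterative_revise(
    document: str,
    summarize: Callable[[str], str],
    revise: Callable[[str, str], str],
    rounds: int = 2,
) -> List[str]:
    """Return the initial summary followed by one draft per revision round.

    Each round feeds the previous draft back into `revise`, mirroring the
    "output of the previous revision is the input for the next" idea.
    """
    drafts = [summarize(document)]
    for _ in range(rounds):
        drafts.append(revise(document, drafts[-1]))
    return drafts


def toy_summarize(document: str) -> str:
    # Hypothetical stand-in for a summarizer: keep the first 8 words.
    return " ".join(document.split()[:8])


def toy_revise(document: str, draft: str) -> str:
    # Hypothetical stand-in for a reviser: drop immediately repeated words.
    # The document argument is unused here; a real reviser might condition on it.
    revised: List[str] = []
    for word in draft.split():
        if not revised or revised[-1] != word:
            revised.append(word)
    return " ".join(revised)


if __name__ == "__main__":
    source = "the the cat sat sat on the mat today and slept"
    for i, draft in enumerate(iterative_revise(source, toy_summarize, toy_revise)):
        print(f"round {i}: {draft}")
```

In this toy setting the first revision removes the repeated words and the second leaves the draft unchanged, which shows the loop reaching a fixed point; a trained reviser would not necessarily converge this way.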