Abstract: Neural abstractive text summarization (NATS) models based on the sequence-to-sequence architecture have recently drawn considerable attention. Real-world texts that need to be summarized range from short news items of a few dozen words to long reports of thousands of words. However, most existing NATS models summarize long documents poorly because of the inherent limitations of their underlying neural architectures. In this paper, we focus on the task of long document summarization (LDS). Using the inherent section structure of source documents, we divide an abstractive LDS problem into several smaller problems. In this setting, providing a less biased target summary as the supervision for each section is vital to the model's performance. As a preliminary step, we formally describe the section-to-summary-sentence (S2SS) alignment for LDS. Based on this, we propose a novel NATS framework for the LDS task. Our framework is built on the theory of unbalanced optimal transport (UOT) and is named UOTSumm. It jointly learns three targets in a unified training objective: the optimal S2SS alignment, a section-level NATS summarizer, and the number of aligned summary sentences for each section. In this way, UOTSumm learns the text alignment directly from summarization data, without resorting to a biased tool such as ROUGE. UOTSumm can be easily adapted to most existing NATS models. We implement two versions of UOTSumm, with and without the pretrain-finetune technique. We evaluate UOTSumm on three publicly available LDS benchmarks: PubMed, arXiv, and GovReport. UOTSumm clearly outperforms its counterparts that use ROUGE for the text alignment. When combined with UOTSumm, two vanilla NATS models improve by a large margin. In addition, UOTSumm achieves better or comparable performance relative to several recent strong baselines.
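To make the idea of a UOT-based S2SS alignment more concrete, the following is a minimal sketch of a generic entropic unbalanced optimal transport solver (Sinkhorn-style scaling with KL-relaxed marginals) applied to a toy section-by-sentence cost matrix. It is not the UOTSumm training objective: in the paper the alignment is learned jointly with the summarizer, whereas here the cost matrix, the weights `a` and `b`, the parameters `eps` and `rho`, and the function name `unbalanced_sinkhorn` are all hypothetical choices made for illustration.

```python
import numpy as np


def unbalanced_sinkhorn(cost, a, b, eps=0.1, rho=1.0, n_iters=200):
    """Entropic unbalanced OT with KL-relaxed marginals (illustrative sketch).

    cost : (n_sections, n_sentences) alignment cost, assumed given here
    a, b : non-negative mass on sections and summary sentences
    eps  : entropic regularization strength (assumed value)
    rho  : marginal relaxation strength; larger values approach balanced OT
    """
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    power = rho / (rho + eps)        # exponent that softens the marginal constraints
    for _ in range(n_iters):
        u = (a / (K @ v)) ** power
        v = (b / (K.T @ u)) ** power
    return u[:, None] * K * v[None, :]   # transport plan


# Toy example: 3 sections, 4 summary sentences, hand-set costs (assumption).
cost = np.array([
    [0.1, 0.9, 0.8, 0.2],
    [0.9, 0.1, 0.7, 0.9],
    [0.8, 0.8, 0.1, 0.9],
])
a = np.full(3, 1.0 / 3.0)   # uniform mass over sections
b = np.full(4, 1.0 / 4.0)   # uniform mass over summary sentences

plan = unbalanced_sinkhorn(cost, a, b)

# Hard alignment: assign each summary sentence to its highest-mass section.
assignment = plan.argmax(axis=0)
# Number of aligned summary sentences per section.
counts = np.bincount(assignment, minlength=cost.shape[0])

print(assignment, counts)
```

Because the marginals are only softly enforced, a section is free to receive more or less mass than its nominal share, which is what lets the number of aligned summary sentences vary from section to section in this toy setting.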

Hierarchical Latent Alignment for Non-Autoregressive Generation under High Compression Ratio

Monotonic Alignments for Summarization

Unified Training for Cross-Lingual Abstractive Summarization by Aligning Parallel Machine Translation Pairs

Joint learning of text alignment and abstractive summarization for long documents via unbalanced optimal transport

HAAN: Learning a Hierarchical Adaptive Alignment Network for Image-Text Retrieval

Combining Multiple Alignments to Improve Machine Translation

Hybrid Alignment Training for Large Language Models

Abstractive Summarization Guided by Latent Hierarchical Document Structure

HANet: Hierarchical Alignment Networks for Video-Text Retrieval

Hie-Transformer: A Hierarchical Hybrid Transformer For Abstractive Article Summarization

Jointly Learning to Align and Summarize for Neural Cross-Lingual Summarization

AlignSum: Data Pyramid Hierarchical Fine-tuning for Aligning with Human Summarization Preference

On The Alignment Problem In Multi-Head Attention-Based Neural Machine Translation

Recurrent Alignment with Hard Attention for Hierarchical Text Rating

Diversify and Combine: Improving Word Alignment for Machine Translation on Low-Resource Languages

Cascade Reward Sampling for Efficient Decoding-Time Alignment

Not Everything is All You Need: Toward Low-Redundant Optimization for Large Language Model Alignment

Structural Supervision for Word Alignment and Machine Translation

A Hierarchical Neural Abstractive Summarization with Self-Attention Mechanism

Neural Abstractive Summarization with Structural Attention

Fuzzy Alignments in Directed Acyclic Graph for Non-Autoregressive Machine Translation