Abstract: In this paper, we introduce and tackle the Outline Generation (OG) task, which aims to unveil the inherent content structure of a multi-paragraph document by identifying its potential sections and generating the corresponding section headings. Without loss of generality, the OG task can be viewed as a novel structured summarization task. To generate a sound outline, an ideal OG model should be able to capture three levels of coherence: the coherence between context paragraphs, the coherence between a section and its heading, and the coherence between context headings. The first is the foundation for section identification, while the latter two are critical for consistent heading generation. In this work, we formulate the OG task as a hierarchical structured prediction problem, i.e., first predicting a sequence of section boundaries and then predicting a sequence of section headings accordingly. We propose a novel hierarchical structured neural generation model, named HiStGen, for the task. Our model attempts to capture the three-level coherence in the following ways. First, we introduce a Markov paragraph dependency mechanism between context paragraphs for section identification. Second, we employ a section-aware attention mechanism to ensure the semantic coherence between a section and its heading. Finally, we leverage a Markov heading dependency mechanism and a review mechanism between context headings to improve the consistency of section headings and eliminate duplication among them. In addition, we build a novel WikiOG dataset, a public collection consisting of over 1.75 million document-outline pairs, for research on the OG task. Experimental results on our benchmark dataset demonstrate that our model significantly outperforms several state-of-the-art sequential generation models on the OG task.
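The two-step formulation described in the abstract (section boundaries first, then one heading per section) can be illustrated with a minimal sketch. This is not HiStGen: the function names, the 0/1 boundary encoding, and the toy `first_words_heading` heuristic are all assumptions made for illustration. In the paper, both steps are performed by a neural model; here the boundaries are simply given as input and the heading generator is a placeholder.

```python
from typing import Callable, List, Tuple

# Hypothetical signature: a heading generator sees the current section and
# the headings produced so far (the "context headings").
HeadingFn = Callable[[List[str], List[str]], str]


def group_into_sections(paragraphs: List[str], boundaries: List[int]) -> List[List[str]]:
    """Group paragraphs into sections.

    Assumed encoding: boundaries[i] == 1 means paragraph i ends its section.
    """
    if len(paragraphs) != len(boundaries):
        raise ValueError("need exactly one boundary label per paragraph")
    sections: List[List[str]] = []
    current: List[str] = []
    for paragraph, is_end in zip(paragraphs, boundaries):
        current.append(paragraph)
        if is_end == 1:
            sections.append(current)
            current = []
    if current:  # trailing paragraphs without a closing boundary
        sections.append(current)
    return sections


def generate_outline(
    paragraphs: List[str], boundaries: List[int], heading_fn: HeadingFn
) -> List[Tuple[str, List[str]]]:
    """Step 1: split by boundaries. Step 2: produce one heading per section."""
    outline: List[Tuple[str, List[str]]] = []
    previous_headings: List[str] = []
    for section in group_into_sections(paragraphs, boundaries):
        heading = heading_fn(section, previous_headings)
        previous_headings.append(heading)
        outline.append((heading, section))
    return outline


def first_words_heading(section: List[str], previous_headings: List[str]) -> str:
    """Toy stand-in for a learned heading generator (an assumption, not the paper's method)."""
    candidate = " ".join(section[0].split()[:2])
    if candidate in previous_headings:  # crude way to avoid duplicate headings
        candidate = f"{candidate} ({len(previous_headings) + 1})"
    return candidate


if __name__ == "__main__":
    doc = [
        "Early life in Paris",
        "School years followed",
        "Career began in 1990",
        "Awards and honors came",
    ]
    for heading, section in generate_outline(doc, [0, 1, 0, 1], first_words_heading):
        print(heading, "->", len(section), "paragraphs")
```

Passing the previously generated headings to the heading function mirrors, in a very loose way, the idea that headings should be consistent with and distinct from their context headings.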

Summarize, Outline, and Elaborate: Long-Text Generation via Hierarchical Supervision from Extractive Summaries

Long text outline generation: Chinese text outline based on unsupervised framework and large language model

Abstractive text summarization model combining a hierarchical attention mechanism and multiobjective reinforcement learning

A study of extractive summarization of long documents incorporating local topic and hierarchical information

Write Summary Step-by-Step: A Pilot Study of Stepwise Summarization

GoSum: Extractive Summarization of Long Documents by Reinforcement Learning and Graph Organized discourse state

A New Approach to Overgenerating and Scoring Abstractive Summaries

Generating Multiple-Length Summaries via Reinforcement Learning for Unsupervised Sentence Summarization

Hierarchical Text Generation using an Outline

A Two-Stage Long Text Summarization Method Based on Discourse Structure

EDU-level Extractive Summarization with Varying Summary Lengths

Topic-Guided Abstractive Text Summarization: a Joint Learning Approach

An Unsupervised Extractive Summarization Method Based on Multi-Round Computation

Efficient Two-stage Approach for Long Document Summarization

Query-oriented unsupervised multi-document summarization via deep learning model

Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents

Outline Generation: Understanding the Inherent Content Structure of Documents

Abstractive Summarization Guided by Latent Hierarchical Document Structure

Modeling Unified Semantic Discourse Structure for High-quality Headline Generation

Discourse-Aware Unsupervised Summarization of Long Scientific Documents

HIBRIDS: Attention with Hierarchical Biases for Structure-aware Long Document Summarization