Abstract: Previous studies have made great advances in RST discourse parsing through specific neural frameworks or features, but they usually split the parsing process into two subtasks and depend heavily on gold discourse segmentation. In this article, we introduce an end-to-end method for sentence-level RST discourse parsing that recasts it as a text-to-text generation task and can also be applied directly to document-level parsing. Our method unifies the traditional two-stage parsing and generates the parse tree directly from the input text through our constrained decoding and postprocessing algorithms, without requiring a complicated model. Moreover, the discourse segmentation is generated at the same time and can be extracted from the parse tree. Experimental results on the RST Discourse Treebank demonstrate that our proposed method outperforms existing methods on both discourse parsing and segmentation. We further carry out ablation studies and more targeted comparisons with traditional two-stage approaches to analyze our method in more detail. Considering the lack of annotated data for RST parsing, we also create high-quality augmented data and apply self-training, which further improves the performance of our method.
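To make the text-to-text framing concrete, the following is a minimal sketch of how a binary discourse tree could be written out as a flat target string, and how the segmentation could then be read back from that string. The bracketed format, the `Relation:Nuclearity` labels, and the names `Leaf`, `Node`, `linearize`, and `extract_edus` are illustrative assumptions, not the linearization or postprocessing used in the paper.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    text: str  # one elementary discourse unit (EDU)


@dataclass
class Node:
    relation: str    # e.g. "Explanation" (hypothetical label set)
    nuclearity: str  # "NS", "SN", or "NN"
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def linearize(tree: Tree) -> str:
    """Write a tree as a bracketed string a seq2seq model could be trained to emit.

    Assumed format: leaves are "( text )", internal nodes are
    "( Relation:Nuclearity <left> <right> )".
    """
    if isinstance(tree, Leaf):
        return f"( {tree.text} )"
    label = f"{tree.relation}:{tree.nuclearity}"
    return f"( {label} {linearize(tree.left)} {linearize(tree.right)} )"


def extract_edus(linearized: str) -> List[str]:
    """Recover the segmentation from a linearized tree.

    A leaf is a bracket pair with no bracket nested inside it. This assumes
    the EDU text itself contains no standalone "(" or ")" tokens.
    """
    edus: List[str] = []
    buffer: List[str] = []
    for token in linearized.split():
        if token == "(":
            buffer = []  # drops any pending relation label
        elif token == ")":
            if buffer:  # non-empty only when closing a leaf
                edus.append(" ".join(buffer))
            buffer = []
        else:
            buffer.append(token)
    return edus


# Hypothetical two-EDU sentence.
example = Node(
    relation="Explanation",
    nuclearity="NS",
    left=Leaf("The bank cut rates"),
    right=Leaf("because inflation fell ."),
)
target = linearize(example)
segments = extract_edus(target)
```

Because the leaves of the generated string are exactly the discourse units, one output sequence carries both the tree and the segmentation, which is the property the abstract describes.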

RST Discourse Parsing As Text-to-Text Generation
