RST Discourse Parsing As Text-to-Text Generation

Xinyu Hu, Xiaojun Wan
DOI: https://doi.org/10.1109/taslp.2023.3306710
2023-01-01
Abstract: Previous studies have made great advances in RST discourse parsing through specific neural frameworks or features, but they usually split the parsing process into two subtasks and depend heavily on gold discourse segmentation. In this article, we introduce an end-to-end method for sentence-level RST discourse parsing that transforms it into a text-to-text generation task and can also be easily applied to document-level parsing. Our method unifies the traditional two-stage parsing and generates the parsing tree directly from the input text through our constrained decoding and postprocessing algorithms, without requiring a complicated model. Moreover, the discourse segmentation is generated simultaneously and can be extracted from the parsing tree. Experimental results on the RST Discourse Treebank demonstrate that our proposed method outperforms existing methods on both discourse parsing and segmentation. We further carry out ablation studies and more targeted comparisons with traditional patterns to analyze our method in more detail. Considering the lack of annotated data in RST parsing, we also create high-quality augmented data and implement self-training, which further improves the performance of our method.
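To make concrete the idea that segmentation can be read off a generated parsing tree, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the output format, so the bracketed linearization, the `EDU` leaf marker, the relation labels, and the function names `extract_edus` and `boundaries` are all assumptions chosen for illustration.

```python
from typing import List


def extract_edus(linearized: str) -> List[str]:
    """Return EDU texts, left to right, from a bracketed tree string.

    Assumed (hypothetical) format: every leaf is "( EDU w1 w2 ... )" and
    every internal node is "( LABEL child child )", whitespace-tokenized.
    Literal parentheses inside the text are not handled in this sketch.
    """
    tokens = linearized.split()
    edus: List[str] = []
    i = 0
    while i < len(tokens):
        # A leaf opens with "(" immediately followed by the "EDU" marker.
        if tokens[i] == "(" and i + 1 < len(tokens) and tokens[i + 1] == "EDU":
            j = i + 2
            words = []
            while j < len(tokens) and tokens[j] != ")":
                words.append(tokens[j])
                j += 1
            edus.append(" ".join(words))
            i = j + 1  # skip past the closing bracket of the leaf
        else:
            i += 1
    return edus


def boundaries(edus: List[str]) -> List[int]:
    """Cumulative token offsets at which each EDU ends (exclusive)."""
    ends: List[int] = []
    total = 0
    for edu in edus:
        total += len(edu.split())
        ends.append(total)
    return ends


# Hypothetical generated output for one sentence.
tree = (
    "( NS-Elaboration ( EDU The bank said ) "
    "( SN-Attribution ( EDU it expects ) ( EDU rates will fall . ) ) )"
)
segments = extract_edus(tree)
print(segments)
print(boundaries(segments))
```

Under these assumptions, the leaves of the generated tree give the discourse segmentation directly, so no separate segmenter is needed once the tree string is well-formed.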