Unsupervised Inference of Data-Driven Discourse Structures using a Tree Auto-Encoder

Patrick Huber, Giuseppe Carenini
DOI: https://doi.org/10.48550/arXiv.2210.09559
2022-10-18
Abstract: With a growing need for robust and general discourse structures in many downstream tasks and real-world applications, the current lack of high-quality, high-quantity discourse trees poses a severe shortcoming. In order to alleviate this limitation, we propose a new strategy to generate tree structures in a task-agnostic, unsupervised fashion by extending a latent tree induction framework with an auto-encoding objective. The proposed approach can be applied to any tree-structured objective, such as syntactic parsing, discourse parsing and others. However, due to the especially difficult annotation process to generate discourse trees, we initially develop such a method to complement task-specific models in generating much larger and more diverse discourse treebanks.
Computation and Language