A Unified Document-Level Chinese Discourse Parser on Different Granularity Levels.

Weihao Liu, Feng Jiang, Yaxin Fan, Xiaomin Chu, Peifeng Li, Qiaoming Zhu
DOI: https://doi.org/10.1007/978-3-031-41676-7_17
2023-01-01
Abstract: Discourse parsing aims to comprehend the structure and semantics of a document. Some previous studies parse documents at multiple levels of granularity but disregard the connections between those levels. Additionally, almost all Chinese discourse parsing approaches concentrate on a single granularity because annotated corpora are lacking. To address these issues, we propose a unified document-level Chinese discourse parser that operates on multiple granularity levels and leverages the connections between paragraphs and Elementary Discourse Units (EDUs) in a document. Specifically, we first identify EDU-level discourse trees and then introduce a structural encoding module to capture EDU-level structural and semantic information, which can significantly improve the construction of paragraph-level discourse trees. Moreover, we construct the Unified Chinese Discourse TreeBank (UCDTB), which includes 467 articles annotated from the clause level up to the whole article, filling the gap in unified corpus resources for Chinese discourse parsing. Experiments on both the Chinese UCDTB and the English RST-DT show that our model outperforms the state-of-the-art (SOTA) baselines.