Abstract: We report our work on building linguistic resources and data-driven parsers for grammatical relation (GR) analysis of Mandarin Chinese. Chinese, as an analytic language, encodes grammatical information in a highly configurational rather than morphological way. Accordingly, it is possible and reasonable to represent almost all grammatical relations as bilexical dependencies. In this work, we propose to represent grammatical information using general directed dependency graphs, in which both local and rich long-distance dependencies are explicitly represented. To create high-quality annotations, we take advantage of an existing treebank, the Chinese TreeBank (CTB), which is grounded in Government and Binding theory. We define a set of linguistic rules to explore CTB’s implicit phrase-structure information and build deep dependency graphs. The reliability of this linguistically motivated GR extraction procedure is supported by manual evaluation. Based on the converted corpus, we explore data-driven models, both graph-based and transition-based, for Chinese GR parsing. For graph-based parsing, we propose a new perspective, graph merging, for building flexible dependency graphs: a complex graph is constructed by first constructing simple subgraphs. Two key problems are discussed from this perspective: (1) how to decompose a complex graph into simple subgraphs, and (2) how to combine subgraphs into a coherent complex graph. For transition-based parsing, we introduce a neural parser based on a list-based transition system. We also discuss several other key problems, including dynamic oracles and beam search for neural transition-based parsing. Our evaluation gauges how successful data-driven models can be at GR parsing for Chinese. The empirical analysis suggests several directions for future study.
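The two graph-merging questions raised in the abstract (how to decompose a graph, and how to recombine the pieces) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Arc` type, the relation labels, the example sentence, and in particular the decomposition criterion (splitting by arc direction), which is not claimed to be the criterion the authors use. Merging is shown as a plain set union of arcs.

```python
from typing import NamedTuple, Set, Tuple


class Arc(NamedTuple):
    """One bilexical dependency (hypothetical representation)."""
    head: int   # index of the head word; 0 is an artificial root
    dep: int    # index of the dependent word
    label: str  # grammatical relation label (illustrative names only)


def decompose(arcs: Set[Arc]) -> Tuple[Set[Arc], Set[Arc]]:
    """Split a graph into two subgraphs by arc direction.

    This criterion is an illustrative assumption. It presumes no arc
    has head == dep, which holds for the dependency graphs shown here.
    """
    leftward = {a for a in arcs if a.head > a.dep}
    rightward = {a for a in arcs if a.head < a.dep}
    return leftward, rightward


def merge(*subgraphs: Set[Arc]) -> Set[Arc]:
    """Combine subgraphs into one graph by taking the union of arcs."""
    merged: Set[Arc] = set()
    for subgraph in subgraphs:
        merged |= subgraph
    return merged


# Toy analysis of a three-word sentence: 1=他 (he), 2=想 (wants), 3=去 (go).
# Word 1 has two heads, so the structure is a graph rather than a tree.
graph = {
    Arc(0, 2, "root"),
    Arc(2, 1, "subj"),
    Arc(2, 3, "comp"),
    Arc(3, 1, "subj"),  # long-distance dependency: 他 is also the subject of 去
}

leftward, rightward = decompose(graph)
restored = merge(leftward, rightward)
heads_of_word_1 = {a.head for a in restored if a.dep == 1}
```

In this toy case the union of the two subgraphs restores the original graph exactly, and word 1 keeps both of its heads. A real parser would predict each subgraph separately, so the combination step would also have to reconcile disagreements between them.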

The Chinese Discourse Parser

Chinese Discourse Relation Recognition Using Parallel Corpus

Recursive Deep Models for Discourse Parsing

Chinese Discourse Segmentation Using Bilingual Discourse Commonality

Towards Intelligent Policy Analysis: A Discourse Structure Parsing Technique for Chinese Government Document

Unifying Discourse Resources with Dependency Framework

The CUHK Discourse Treebank for Chinese: Annotating Explicit Discourse Connectives for the Chinese Treebank

Zero-shot Chinese Discourse Dependency Parsing via Cross-lingual Mapping

Discourse Representation Structure Parsing for Chinese

Exploiting Discourse Relations for Sentiment Analysis

A Pilot Study on Dialogue-Level Dependency Parsing for Chinese

Exploring Chinese and English Discourse Dependency Treebanks

Neural Network Models for Implicit Discourse Relation Classification in English and Chinese without Surface Features

Grammatical Relations in Chinese: GB-Ground Extraction and Data-Driven Parsing

Text-Level Discourse Dependency Parsing

Cross-Lingual Identification of Ambiguous Discourse Connectives for Resource-Poor Language

A Survey of Implicit Discourse Relation Recognition

Improve Discourse Dependency Parsing with Contextualized Representations

Parsing Chinese Sentences with Grammatical Relations

Chinese Semantic Dependency Relation System and Treebank Construction

Research on Discourse Parsing: from the Dependency View