Discourse Representation Structure Parsing for Chinese

Chunliu Wang, Xiao Zhang, Johan Bos
2023-06-16
Abstract: Previous work has predominantly focused on monolingual English semantic parsing. We, instead, explore the feasibility of Chinese semantic parsing in the absence of labeled data for Chinese meaning representations. We describe the pipeline for automatically collecting linearized Chinese meaning representation data for sequence-to-sequence neural networks. We further propose a test suite designed explicitly for Chinese semantic parsing, which provides fine-grained evaluation of parsing performance and lets us study the difficulties of parsing Chinese. Our experimental results show that the difficulty of Chinese semantic parsing is mainly caused by adverbs. Realizing Chinese parsing through machine translation and an English parser yields slightly lower performance than training a model directly on Chinese data.
Computation and Language
What problem does this paper attempt to address?
The main problem this paper addresses is whether Chinese semantic parsing, specifically parsing into Discourse Representation Structures (DRS), is feasible when no annotated Chinese meaning representations are available, and what makes the task hard. The paper focuses on the following aspects:

1. **Applicability of existing DRS parsing models to Chinese**: Investigate whether existing DRS parsing models can be applied effectively to Chinese, i.e., whether they reach performance on Chinese comparable to their performance on English.
2. **Challenges in Chinese semantic parsing**: Analyze the specific difficulties that arise when parsing Chinese, especially the impact of adverbs on parsing performance.
3. **Feasibility of machine translation plus an English parser**: Study whether Chinese text can be translated into English and then parsed with an English parser, and how this approach compares with a parser trained specifically for Chinese.
4. **Fine-grained evaluation methods**: Propose a set of fine-grained evaluation metrics that assess experimental results more precisely and reduce the workload of manual evaluation.

To achieve these goals, the authors use Chinese-English aligned texts from a parallel corpus (the Parallel Meaning Bank, PMB) and generate Chinese DRS data for training and evaluation through a series of processing steps, including text tokenization, Chinese-English alignment, and named entity replacement.

The experimental results show that the difficulty of Chinese semantic parsing is concentrated mainly in the handling of adverbs. Parsing via machine translation and an English parser is feasible, but it performs slightly worse than a model trained directly on Chinese data. The proposed fine-grained metrics make it possible to compare how different parsers perform on individual subtasks.
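The named entity replacement step mentioned above can be illustrated with a minimal sketch. It assumes a clause-style linearized DRS in which names appear as quoted string literals, and an English-to-Chinese name mapping already obtained from word alignment. The function name `replace_named_entities`, the example DRS string, and the mapping are all hypothetical and are not taken from the paper's code.

```python
import re


def replace_named_entities(linearized_drs: str, name_alignment: dict[str, str]) -> str:
    """Swap quoted English names in a linearized DRS for aligned Chinese names.

    Assumption: names are the quoted literals whose lowercased content
    appears as a key in `name_alignment`; all other literals are kept.
    """

    def swap(match: re.Match) -> str:
        english_name = match.group(1)
        chinese_name = name_alignment.get(english_name.lower())
        # Leave the literal untouched when no aligned Chinese name is known.
        return f'"{chinese_name}"' if chinese_name else match.group(0)

    return re.sub(r'"([^"]*)"', swap, linearized_drs)


# Hypothetical linearized DRS for "Tom smiles" and a hypothetical alignment.
example_drs = 'b1 REF x1 b1 Name x1 "tom" b1 smile "v.01" e1 b1 Agent e1 x1'
example_alignment = {"tom": "汤姆"}

chinese_drs = replace_named_entities(example_drs, example_alignment)
print(chinese_drs)
```

Only literals found in the alignment are rewritten, so sense tags such as `"v.01"` pass through unchanged. A real pipeline would also need to handle multi-token names and alignment errors, which this sketch ignores.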