Paraphrase and Parallel Treebank for the Comparison of French and Chinese Syntax

Rafael Poiret, Simon Mille, Haitao Liu
DOI: https://doi.org/10.1075/lic.20002.poi
2021-01-01
Languages in Contrast
Abstract: This paper proposes to study the contrastive syntax of French and Chinese through the lens of syntactic mismatches, making use of parallel treebanks. A syntactic mismatch is a dissimilarity between the syntactic structure of a linguistic unit and that of its translation. Syntactic mismatches are formalized using the notion of paraphrase from the Meaning-Text Theory, which allows for capturing mismatches at different levels of linguistic description (e.g. Semantic, Deep-Syntactic, and Surface-Syntactic). In this paper, we report in detail on the types of paraphrases found in the seed corpus, demonstrating that the Deep-Syntactic paraphrases constitute the best starting point for our study. Then, we show how, starting from the seed corpus, we semi-automatically constructed a multi-layer parallel treebank with the alignment and annotation of paraphrases.