Multilingual Universal Dependency Parsing from Raw Text with Low-Resource Language Enhancement.

Yingting Wu, Hai Zhao, Jia-Jun Tong
DOI: https://doi.org/10.18653/v1/k18-2007
2018-01-01
Abstract: This paper describes the system of our team Phoenix for the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. Given the annotated gold-standard data in CoNLL-U format, we train the tokenizer, tagger and parser separately for each treebank using the open-source pipeline tool UDPipe. Our system reads plain text as input, performs the preprocessing steps (tokenization, lemmatization, morphological tagging) and finally outputs the syntactic dependencies. For the low-resource languages with no training data, we use cross-lingual techniques to build models from closely related languages instead. In the official evaluation, our system achieves macro-averaged scores of 65.61%, 52.26% and 55.71% for LAS, MLAS and BLEX, respectively.
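To make the reported LAS figure concrete, the following is a minimal sketch of a labeled attachment score computed over CoNLL-U data and then macro-averaged across treebanks. It is a simplification and not the shared task's official scorer: it assumes the system and gold files share the same tokenization, whereas an evaluation from raw text has to align system words to gold words first, and it compares full dependency labels without any subtype handling. The helper names (`make_conllu`, `read_conllu`, `las`, `macro_average`) and the toy sentence are hypothetical and chosen only for illustration.

```python
def make_conllu(rows):
    """Build a one-sentence CoNLL-U string from (id, form, head, deprel) rows.

    Hypothetical helper: columns that are irrelevant here are filled with "_".
    """
    lines = [
        "\t".join([str(i), form, "_", "_", "_", "_", str(head), deprel, "_", "_"])
        for i, form, head, deprel in rows
    ]
    return "\n".join(lines) + "\n"


def read_conllu(text):
    """Parse CoNLL-U text into sentences of (id, head, deprel) tuples."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        current.append((int(cols[0]), int(cols[6]), cols[7]))
    if current:
        sentences.append(current)
    return sentences


def las(gold, system):
    """Simplified LAS: share of words whose head AND label both match gold.

    Assumes identical tokenization and sentence segmentation in both inputs.
    """
    total = correct = 0
    for gold_sent, sys_sent in zip(gold, system):
        for gold_tok, sys_tok in zip(gold_sent, sys_sent):
            total += 1
            correct += gold_tok[1:] == sys_tok[1:]
    return correct / total if total else 0.0


def macro_average(scores):
    """Unweighted mean of per-treebank scores."""
    return sum(scores) / len(scores)


# Toy data: the system attaches "The" to the wrong head.
GOLD = make_conllu([(1, "The", 2, "det"), (2, "dog", 3, "nsubj"), (3, "barks", 0, "root")])
SYSTEM = make_conllu([(1, "The", 3, "det"), (2, "dog", 3, "nsubj"), (3, "barks", 0, "root")])

score = las(read_conllu(GOLD), read_conllu(SYSTEM))
print(f"LAS on the toy sentence: {score:.4f}")
print(f"Macro-average over two treebanks: {macro_average([score, 1.0]):.4f}")
```

Because the macro-average weights every treebank equally, weak results on the small low-resource treebanks lower the overall score as much as weak results on large ones would.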