NJUNLP's Submission for CCMT20 Quality Estimation Task.

Qu Cui, Xiang Geng, Shujian Huang, Jiajun Chen
DOI: https://doi.org/10.1007/978-981-33-6162-1_11
2020-01-01
Abstract: Quality Estimation (QE) is the task of predicting the quality of translations without relying on any references. QE systems are based on neural features but suffer from the limited size of QE data. The best models nowadays transfer bilingual knowledge from parallel data to QE tasks. However, the distribution mismatch between parallel data and QE data may prevent the parallel data from being used to full effect. More specifically, parallel translations contain no errors, while translations in QE data may contain more than one error. To alleviate this problem, we propose a model that masks some tokens on the target side of parallel data but is still required to predict every target token. Based on this model, we propose a variant that uses a masked language model on the target side to obtain deep bi-directional information. In addition, we try different ensemble methods to improve performance on the CCMT20 Quality Estimation Task. Our system finally won second place in the ZH-EN language pair and third place in the EN-ZH language pair.
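The target-side masking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `[MASK]` symbol, the per-token masking probability, and the function name `mask_target` are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows the data preparation step, in which some target tokens are hidden while the labels remain the full original sentence.

```python
import random

# Assumed placeholder symbol; the abstract does not name the mask token.
MASK_TOKEN = "[MASK]"


def mask_target(tokens, mask_prob=0.15, seed=None):
    """Return (corrupted_input, labels) for one target sentence.

    Each token is independently replaced by MASK_TOKEN with probability
    mask_prob (an assumed hyperparameter). The labels are the complete
    original sentence, so a model trained on this pair has to predict
    every target token, masked or not.
    """
    rng = random.Random(seed)
    corrupted = [MASK_TOKEN if rng.random() < mask_prob else tok for tok in tokens]
    labels = list(tokens)
    return corrupted, labels


# Hypothetical target sentence from a parallel corpus.
target = "the cat sat on the mat".split()
corrupted, labels = mask_target(target, mask_prob=0.3, seed=0)
print(corrupted)
print(labels)
```

The intuition, under these assumptions, is that the masked positions make the error-free parallel target look less reliable to the model, which is closer to the QE setting where the translation may contain errors.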