Alibaba-Translate China's Submission for WMT 2022 Metrics Shared Task

Yu Wan, Keqin Bao, Dayiheng Liu, Baosong Yang, Derek F. Wong, Lidia S. Chao, Wenqiang Lei, Jun Xie
DOI: https://doi.org/10.48550/arXiv.2210.09683
2022-10-18
Computation and Language
Abstract: In this report, we present our submission to the WMT 2022 Metrics Shared Task. We build our system on the core idea of UNITE (Unified Translation Evaluation), which unifies source-only, reference-only, and source-reference-combined evaluation scenarios in a single model. Specifically, during the model pre-training phase, we first use pseudo-labeled data examples to continue pre-training UNITE. Notably, to reduce the gap between pre-training and fine-tuning, we use data cropping and a ranking-based score normalization strategy. During the fine-tuning phase, we use both Direct Assessment (DA) and Multidimensional Quality Metrics (MQM) data from past years' WMT competitions. In particular, we collect the results from models with different pre-trained language model backbones and use different ensembling strategies for the translation directions involved.
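The abstract does not spell out how the ranking-based score normalization works. As an illustration only, the following is a minimal sketch of one generic form of rank-based normalization, assuming that raw pseudo-label scores are replaced by their percentile rank so that scores from different sources share a common scale. The function name `rank_normalize` and the tie-handling rule are hypothetical and are not taken from the report.

```python
def rank_normalize(scores):
    """Map raw scores to (0, 1] by rank; tied scores share the average rank.

    Illustrative assumption only: this is a generic percentile-rank
    normalization, not necessarily the strategy used in the submission.
    """
    n = len(scores)
    if n == 0:
        return []
    # Indices of the scores, sorted from lowest to highest score.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j to cover the whole group of tied scores starting at i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    # Dividing by n turns ranks into percentile-style values in (0, 1].
    return [r / n for r in ranks]


if __name__ == "__main__":
    print(rank_normalize([0.2, 5.0, 1.0, 1.0]))
```

Because only the ordering of the scores matters, a normalization of this kind is insensitive to the scale and outliers of the raw pseudo-labels, which is one reason a rank-based scheme might be chosen over a linear rescaling.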