COMET: A Neural Framework for MT Evaluation

Ricardo Rei, Craig Stewart, Ana C Farinha, Alon Lavie
DOI: https://doi.org/10.48550/arXiv.2009.09025
2020-09-18
Computation and Language
Abstract: We present COMET, a neural framework for training multilingual machine translation evaluation models which obtains new state-of-the-art levels of correlation with human judgements. Our framework leverages recent breakthroughs in cross-lingual pretrained language modeling, resulting in highly multilingual and adaptable MT evaluation models that exploit information from both the source input and a target-language reference translation in order to more accurately predict MT quality. To showcase our framework, we train three models with different types of human judgements: Direct Assessments, Human-mediated Translation Edit Rate, and Multidimensional Quality Metrics. Our models achieve new state-of-the-art performance on the WMT 2019 Metrics shared task and demonstrate robustness to high-performing systems.