What problem does this paper attempt to address?

The main problems this paper attempts to solve are the deficiencies of existing Machine Translation (MT) Quality Estimation (QE) and Automatic Post-Editing (APE) datasets. Specifically, these problems include:

1. **Lack of transparent MT models**: Existing QE methods cannot access the internal state or confidence information of the MT system that generates the translations, which limits the application of so-called "glass-box" methods.
2. **Single-mode quality assessment**: Current datasets are based either on direct human assessment or on the differences between translations and their post-edited versions (for example, through HTER or by marking words as OK/BAD), but they do not include both kinds of assessment at once, so the correlation between the two remains unclear.
3. **Uneven resource distribution**: Most existing datasets concentrate on high-resource language pairs, for which translation quality is usually high, while less data exists for medium- and low-resource language pairs, even though these pairs need QE assistance more.
4. **Domain limitations**: Existing datasets mostly cover specific domains (such as IT or life sciences) and use domain-specific MT models for translation. Most sentences are therefore translated well, which makes it difficult to reflect the challenges of real-life scenarios.

To solve these problems, the authors introduce the MLQE-PE dataset, a multilingual quality estimation and automatic post-editing dataset that aims to overcome the limitations of existing datasets and give researchers more comprehensive and diverse data.

### Features of the MLQE-PE dataset

- **Open NMT models**: It provides the state-of-the-art Neural Machine Translation (NMT) models used to generate the translations, allowing researchers to use the model's uncertainty or internal state for quality estimation.
- **Combination of two assessment methods**: It includes both Direct Assessment (DA) and post-editing effort (HTER), so that translation quality can be measured from different perspectives.
- **Document-level context**: It contains the titles of the Wikipedia articles from which the original sentences were taken, so that document-level context can be considered when predicting sentence-level or word-level translation quality.
- **Coverage of multiple language pairs**: It includes 11 language pairs, covering high-, medium-, and low-resource language pairs to ensure data diversity and wide applicability.

Through these improvements, the MLQE-PE dataset provides richer and more comprehensive data for research on machine translation quality estimation and automatic post-editing.
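To make the post-editing effort measure more concrete, the sketch below approximates HTER as the word-level edit distance between an MT output and its post-edited version, divided by the length of the post-edited version. This is a simplified illustration, not the dataset's own tooling: real TER/HTER also counts block shifts as single edits, the whitespace tokenization is an assumption, and the function names `edit_distance` and `approx_hter` are hypothetical.

```python
def edit_distance(hyp, ref):
    """Token-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))  # distances from an empty hypothesis
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # distance from hyp[:i] to an empty reference
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def approx_hter(mt, pe):
    """Approximate HTER: edits needed to turn `mt` into `pe`, per post-edited token.

    Assumption: whitespace tokenization and no shift operations, so this can
    overestimate true HTER when words are merely reordered.
    """
    mt_tokens, pe_tokens = mt.split(), pe.split()
    if not pe_tokens:
        # Illustrative convention for an empty post-edit, not taken from the paper.
        return 0.0 if not mt_tokens else 1.0
    return edit_distance(mt_tokens, pe_tokens) / len(pe_tokens)


if __name__ == "__main__":
    mt = "the cat sat on mat"
    pe = "the cat sat on the mat"
    print(approx_hter(mt, pe))  # one insertion over six post-edited tokens
```

A score of 0 means the translation needed no edits; higher values mean the post-editor changed a larger share of the sentence. The word-level OK/BAD labels mentioned above are typically derived from the same alignment between the MT output and its post-edit.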

MLQE-PE: A Multilingual Quality Estimation and Post-Editing Dataset

QUAK: A Synthetic Quality Estimation Dataset for Korean-English Neural Machine Translation

From Handcrafted Features to LLMs: A Brief Survey for Machine Translation Quality Estimation

Pushing the Right Buttons: Adversarial Evaluation of Quality Estimation

Rethink about the Word-level Quality Estimation for Machine Translation from Human Judgement

Unsupervised Quality Estimation for Neural Machine Translation

MQM-APE: Toward High-Quality Error Annotation Predictors with Automatic Post-Editing in LLM Translation Evaluators

Together We Can: Multilingual Automatic Post-Editing for Low-Resource Languages

Self-Supervised Quality Estimation for Machine Translation

Practical Perspectives on Quality Estimation for Machine Translation

Post-editese: an Exacerbated Translationese

The Machine Translation Post-Editing Annotation System (MTPEAS)

Information Dropping Data Augmentation for Machine Translation Quality Estimation

APE at Scale and its Implications on MT Evaluation Biases

"A Little is Enough": Few-Shot Quality Estimation based Corpus Filtering improves Machine Translation

QUEST: Quality-Aware Metropolis-Hastings Sampling for Machine Translation

QE-EBM: Using Quality Estimators as Energy Loss for Machine Translation

Qualitative: Python Tool for MT Quality Estimation Supporting Server Mode and Hybrid MT

Mismatching-aware unsupervised translation quality estimation for low-resource languages

MTUncertainty: Assessing the Need for Post-editing of Machine Translation Outputs by Fine-tuning OpenAI LLMs