Prediction of Translation Techniques for the Translation Process

Fan Zhou, Vincent Vandeghinste
2024-03-21
Abstract: Machine translation (MT) encompasses a variety of methodologies aimed at enhancing the accuracy of translations. In contrast, the process of human-generated translation relies on a wide range of translation techniques, which are crucial for ensuring linguistic adequacy and fluency. This study suggests that these translation techniques could further optimize machine translation if they are automatically identified before being applied to guide the translation process effectively. The study differentiates between two scenarios of the translation process: from-scratch translation and post-editing. For each scenario, a specific set of experiments has been designed to forecast the most appropriate translation techniques. The findings indicate that the predictive accuracy for from-scratch translation reaches 82%, while the post-editing process exhibits even greater potential, achieving an accuracy rate of 93%.
Computation and Language
What problem does this paper attempt to address?
This paper investigates how to predict which translation techniques apply during the translation process, with the aim of improving machine translation quality and accuracy. It distinguishes two scenarios: translation from scratch and post-editing. Using source sentences and low-quality machine translation output, the authors designed a set of experiments for each scenario to predict the most suitable translation techniques. Prediction accuracy reaches 82% for translation from scratch and rises to 93% for post-editing.
The study finds that inappropriate translation techniques are the main cause of low-quality translations. It proposes using translation techniques as a guide to optimize machine translation, and notes that they can also serve as hints for generating high-quality translations with large language models. The experiments show that pre-trained cross-lingual language models can predict translation techniques effectively after fine-tuning.
The paper reviews related work, including improvements to neural machine translation (NMT) architectures, automatic post-editing, and the impact of translation techniques on translation problems. The data is a manually verified English-Chinese parallel corpus, prepared for the experiments through feature extraction and annotation. Four architectures are evaluated across the from-scratch and post-editing tasks, using pre-trained models such as BERT, BART, and mT5. The models predict translation techniques with reasonable accuracy, and perform best on the post-editing task.
The conclusion states that predicting translation techniques helps improve translation quality and fluency, but that the current work focuses mainly on prediction before translation. Future research will address how to integrate these techniques into NMT systems to generate more accurate translations. The paper also highlights the challenges of data acquisition and sentence alignment, and the need for automated alignment methods.
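The two scenarios can be read as one classification task with different inputs. The sketch below is illustrative only and is not the authors' code: the label set `TECHNIQUES`, the separator `SEP`, the `Example` class, and the assumption that post-editing pairs the source sentence with the draft translation are all hypothetical choices made for this example. It shows how the classifier input might be assembled per scenario and how an accuracy figure such as 82% or 93% is computed from predicted and gold labels.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical label set for illustration; the summary does not list the
# technique inventory actually used in the paper.
TECHNIQUES = ("literal", "transposition", "modulation", "equivalence")

# Assumed separator; in practice it depends on the tokenizer of the model.
SEP = " [SEP] "


@dataclass
class Example:
    source: str                   # source-language sentence
    technique: str                # gold translation-technique label
    draft: Optional[str] = None   # low-quality MT output (post-editing only)


def build_input(example: Example) -> str:
    """Return the classifier input text for one example.

    From-scratch translation uses the source sentence alone; post-editing
    (by assumption here) pairs the source with the draft translation.
    """
    if example.draft is None:
        return example.source
    return example.source + SEP + example.draft


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of examples whose predicted technique equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)
```

In a real setup, the strings returned by `build_input` would be tokenized and passed to a fine-tuned pre-trained model, and `accuracy` would be applied to its predictions on a held-out set.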