Translating Step-by-Step: Decomposing the Translation Process for Improved Translation Quality of Long-Form Texts

Eleftheria Briakou,Jiaming Luo,Colin Cherry,Markus Freitag
2024-09-11
Abstract:In this paper we present a step-by-step approach to long-form text translation, drawing on established processes in translation studies. Instead of viewing machine translation as a single, monolithic task, we propose a framework that engages language models in a multi-turn interaction, encompassing pre-translation research, drafting, refining, and proofreading, resulting in progressively improved translations. Extensive automatic evaluations using Gemini 1.5 Pro across ten language pairs show that translating step-by-step yields large translation quality improvements over conventional zero-shot prompting approaches and earlier human-like baseline strategies, resulting in state-of-the-art results on WMT2024.
Computation and Language
What problem does this paper attempt to address?
The paper attempts to address the issue of improving the quality of machine translation (MT) when dealing with long texts. Traditionally, machine translation has been viewed as a sequence transformation task that maps source language text to equivalent translations in the target language. However, this approach has limitations when handling complex, lengthy texts. The paper proposes a step-by-step translation method that breaks down the translation process into multiple subtasks, including pre-translation research, drafting, revising, and proofreading, to gradually improve translation quality through these steps. Specifically, the paper aims to explore whether large language models (LLMs) can achieve higher quality translations by mimicking the multi-step interactions in the human translation process. The main contributions of the paper are: 1. **Step-by-Step Translation Framework**: Proposes a new translation framework that decomposes the translation process into multiple stages, each with specific tasks and goals. 2. **Multi-Round Interaction**: Gradually improves translation quality through multiple rounds of interaction with LLMs, rather than completing the translation task in one go. 3. **Experimental Evidence**: Demonstrates the significant advantages of the step-by-step translation method across various language pairs, especially in long text translation, through extensive automatic evaluation on the WMT 2024 dataset. In summary, the paper attempts to significantly improve the quality of long text translation by introducing a step-by-step translation method that leverages the capabilities of LLMs to simulate the multi-stage interactions in the human translation process.