Abstract: The evolution of Neural Machine Translation (NMT) has been significantly influenced by six core challenges (Koehn and Knowles, 2017), which have acted as benchmarks for progress in this field. This study revisits these challenges, offering insights into their ongoing relevance in the context of advanced Large Language Models (LLMs): domain mismatch, amount of parallel data, rare word prediction, translation of long sentences, attention model as word alignment, and sub-optimal beam search. Our empirical findings indicate that LLMs effectively lessen the reliance on parallel data for major languages in the pretraining phase. Additionally, the LLM-based translation system significantly enhances the translation of long sentences that contain approximately 80 words and shows the capability to translate documents of up to 512 words. However, despite these significant improvements, the challenges of domain mismatch and prediction of rare words persist. While the challenges of word alignment and beam search, specifically associated with NMT, may not apply to LLMs, we identify three new challenges for LLMs in translation tasks: inference efficiency, translation of low-resource languages in the pretraining phase, and human-aligned evaluation. The datasets and models are released at

What problem does this paper attempt to address?

This paper re-examines the six classic challenges of machine translation (MT) in the context of large language models (LLMs), asking whether each challenge still exists or has changed. The six classic challenges are:

1. **Domain Mismatch**:
   - Problem description: Texts from different domains differ in terminology, style, and so on, which degrades the performance of translation systems on cross-domain text.
   - Research findings: Although LLMs are exposed to a large amount of diverse data during pre-training, they still suffer from terminology mismatches, style differences, and hallucinations when handling domain-specific text.
2. **Amount of Parallel Data**:
   - Problem description: Traditional neural machine translation (NMT) systems rely on large parallel corpora for training. Do LLMs still need a large amount of parallel data?
   - Research findings: LLMs reduce the dependence on parallel data for high-resource languages. A small amount of high-quality parallel data can significantly improve translation performance, whereas too much parallel data may reduce it.
3. **Rare Word Prediction**:
   - Problem description: How to accurately predict and translate rare words, such as proper nouns and compound words.
   - Research findings: LLMs predict high-frequency words well, but for rare words that appear fewer than 8 times, precision is low and the deletion rate is high.
4. **Translation of Long Sentences**:
   - Problem description: Translating long sentences requires accurately capturing context, which places higher demands on the comprehension ability of translation systems.
   - Research findings: LLMs perform very well on long sentences (about 80 words) and on document-level translation (up to 512 words), far exceeding traditional NMT models.
5. **Word Alignment**:
   - Problem description: Extracting the word alignment between the source and target languages from the attention mechanism, in order to explain how the translation model works.
   - Research findings: Extracting word-alignment information from the attention weights of LLMs is not feasible, but aggregated attention weights can serve as clues for interpreting LLMs.
6. **Inference Efficiency**:
   - Problem description: How the decoding strategy used at inference time (such as beam search or sampling) and inference efficiency affect translation quality.
   - Research findings: Beam search achieves higher BLEU scores than sampling, but sampling performs better on rare words. In addition, LLM inference is much less efficient than that of traditional NMT models, which increases latency.

The paper also points out three new challenges:

- Inference Efficiency
- Pretraining Resource Imbalance for Low-Resource Languages
- Human-Aligned Evaluation

By re-examining these challenges, the paper provides insights and directions for future research.
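To make the contrast between beam search and sampling in the sixth item concrete, here is a minimal Python sketch. It is not the paper's experimental setup: `TOY_MODEL`, its probabilities, and the helper names `next_token_probs`, `beam_search`, and `sample` are all hypothetical and invented for illustration. Beam search keeps only the highest-scoring partial outputs, while sampling draws each token from the model distribution and can therefore emit a low-probability word.

```python
import math
import random

EOS = "</s>"

# Hypothetical toy "model": maps a prefix to a next-token distribution.
# All probabilities are invented for illustration only.
TOY_MODEL = {
    (): {"the": 0.6, "a": 0.4},
    ("the",): {"cat": 0.5, "dog": 0.3, "axolotl": 0.2},
    ("a",): {"cat": 0.9, "dog": 0.1},
}


def next_token_probs(prefix):
    """Return the next-token distribution; unknown prefixes end the sentence."""
    return TOY_MODEL.get(tuple(prefix), {EOS: 1.0})


def beam_search(beam_size=2, max_len=4):
    """Keep the `beam_size` best partial outputs by cumulative log-probability."""
    beams = [((), 0.0)]  # (tokens, log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for tok, p in next_token_probs(tokens).items():
                hyp = (tokens + (tok,), score + math.log(p))
                if tok == EOS:
                    finished.append(hyp)
                else:
                    candidates.append(hyp)
        if not candidates:
            break
        # Prune: only the top `beam_size` hypotheses survive to the next step.
        beams = sorted(candidates, key=lambda h: h[1], reverse=True)[:beam_size]
    return max(finished or beams, key=lambda h: h[1])


def sample(rng, max_len=4):
    """Draw one output by sampling each token from the model distribution."""
    tokens = ()
    for _ in range(max_len):
        probs = next_token_probs(tokens)
        tok = rng.choices(list(probs), weights=list(probs.values()), k=1)[0]
        tokens += (tok,)
        if tok == EOS:
            break
    return tokens


if __name__ == "__main__":
    print("beam size 1:", beam_search(beam_size=1))
    print("beam size 2:", beam_search(beam_size=2))
    rng = random.Random(0)
    print("samples:", [sample(rng) for _ in range(5)])
```

In this toy setting, a beam of size 1 (greedy decoding) commits to "the" and ends with "the cat" (probability 0.30), while a beam of size 2 recovers the more probable "a cat" (0.36). Neither ever outputs the low-probability word "axolotl", whereas repeated sampling occasionally does. This only illustrates the mechanism; it says nothing about the magnitude of the effect reported in the paper.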

Salute the Classic: Revisiting Challenges of Machine Translation in the Age of Large Language Models

A Paradigm Shift: The Future of Machine Translation Lies with Large Language Models

Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis

Adaptive Machine Translation with Large Language Models

A Novel Paradigm Boosting Translation Capabilities of Large Language Models

What do Large Language Models Need for Machine Translation Evaluation?

Adapting Large Language Models for Document-Level Machine Translation

Mitigating the Language Mismatch and Repetition Issues in LLM-based Machine Translation via Model Editing

Quality or Quantity? On Data Scale and Diversity in Adapting Large Language Models for Low-Resource Translation

Document-Level Machine Translation with Large Language Models

A Review of Current Trends, Techniques, and Challenges in Large Language Models (LLMs)

Towards Effective Disambiguation for Machine Translation with Large Language Models

A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models

Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models

Scaling neural machine translation to 200 languages

Progress in Machine Translation

Language Models are Good Translators

Language Modelling Approaches to Adaptive Machine Translation

Exploring Human-Like Translation Strategy with Large Language Models

Six Challenges for Neural Machine Translation

Survey of different Large Language Model Architectures: Trends, Benchmarks, and Challenges