Abstract: This paper proposes an English-Chinese machine translation method based on transfer learning. It first reviews the theory and related techniques of neural machine translation and transfer learning, compares the advantages and disadvantages of several neural machine translation models, and selects the transformer as the model framework. To address low-resource Chinese-English and Tibetan-Chinese translation, 30 million Chinese-English parallel sentence pairs, 100,000 low-resource Chinese-English parallel sentence pairs, and 100,000 Tibetan-Chinese parallel sentence pairs were used to pretrain the transformer machine translation architecture. The decoders are all composed of 6 identical hidden layers, the model parameters are initialized from a uniform distribution, and training uses Adam as the optimizer. In the model transfer step, the better-performing parameters of the pretrained model are transferred to the training of the low-resource Chinese-English and Tibetan-Chinese machine translation models, thereby achieving knowledge transfer. The results show that model transfer learning on the low-resource Chinese-English parallel corpus improves translation by 3.97 BLEU compared with the translation system without transfer learning at 0.34 BLEU. Model transfer learning on the low-resource Tibetan-Chinese parallel corpus raises BLEU by 2.64 compared with the translation system without transfer learning. A neural machine translation system that combines BPE preprocessing with model transfer learning improves by a further 0.26 BLEU over a system that uses transfer learning alone. These results verify that the proposed transfer learning method improves low-resource Chinese-English and Tibetan-Chinese neural machine translation models to some extent.
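
The parameter transfer step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it uses plain Python lists in place of tensors, and the parameter names, the tiny model dimensions, the initialization range, and the rule "copy a parameter when its name and shape match" are all assumptions made for illustration. Parameters whose shape depends on vocabulary size (here the hypothetical `embedding`) are left at their fresh uniform initialization.

```python
import random


def init_uniform(shapes, scale=0.1, seed=0):
    """Create one matrix per named shape, drawn from U(-scale, scale)."""
    rng = random.Random(seed)
    return {
        name: [[rng.uniform(-scale, scale) for _ in range(cols)]
               for _ in range(rows)]
        for name, (rows, cols) in shapes.items()
    }


def shape_of(matrix):
    """Return (rows, cols) of a list-of-lists matrix."""
    return (len(matrix), len(matrix[0]))


def model_shapes(vocab_size, d_model=4, num_layers=6):
    """Hypothetical parameter layout: one embedding plus 6 decoder layers."""
    shapes = {"embedding": (vocab_size, d_model)}
    for i in range(num_layers):
        shapes[f"decoder.layer{i}.weight"] = (d_model, d_model)
    return shapes


def transfer_parameters(parent, child):
    """Copy each parent matrix into the child when name and shape match.

    Unmatched child parameters keep their fresh initialization.
    Returns the names of the transferred parameters.
    """
    transferred = []
    for name, matrix in parent.items():
        if name in child and shape_of(child[name]) == shape_of(matrix):
            child[name] = [row[:] for row in matrix]  # copy rows, do not alias
            transferred.append(name)
    return transferred


# Stand-in for the model pretrained on the large corpus (assumed vocabulary size).
parent = init_uniform(model_shapes(vocab_size=10), seed=1)
# Stand-in for the low-resource model, with a different assumed vocabulary size.
child = init_uniform(model_shapes(vocab_size=7), seed=2)

transferred = transfer_parameters(parent, child)
print(transferred)
```

In this sketch the six decoder matrices are copied from the parent, while the embedding is skipped because the two vocabularies differ in size. Training of the low-resource model would then continue from these parameters, typically with the same optimizer as in pretraining.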

Machine Translation Model based on Non-parallel Corpus and Semi-supervised Transductive Learning

Semi-Supervised Learning for Neural Machine Translation

A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models

Bilingual Corpus Mining and Multistage Fine-Tuning for Improving Machine Translation of Lecture Transcripts

English-Chinese Machine Translation Based on Transfer Learning and Chinese-English Corpus

Unsupervised Parallel Corpus Mining on Web Data

Bilingual lexicon induction from non-parallel corpora

Iterative Learning of Parallel Lexicons and Phrases from Non-Parallel Corpora

Sentence Alignment with Parallel Documents Facilitates Biomedical Machine Translation

Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation

Semi-Supervised Neural Machine Translation Via Marginal Distribution Estimation

Automatic Translating Between Ancient Chinese and Contemporary Chinese with Limited Aligned Corpora

Obtaining Parallel Sentences in Low-Resource Language Pairs with Minimal Supervision

Generating Virtual Parallel Corpus - A Compatibility Centric Method

Study on Machine Translation Teaching Model Based on Translation Parallel Corpus and Exploitation for Multimedia Asian Information Processing

Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora

Sentence Alignment with Parallel Documents Helps Biomedical Machine Translation

Reciprocal Supervised Learning Improves Neural Machine Translation

UM-Corpus: A Large English-Chinese Parallel Corpus for Statistical Machine Translation

Continual Knowledge Distillation for Neural Machine Translation

Multi-domain machine translation enhancements by parallel data extraction from comparable corpora