An Improved LA-Transformer Machine Translation Model

Zumin Wang, Chengye Zhang, Fengbo Bai, Yingjie Wang
DOI: https://doi.org/10.1109/swc57546.2023.10448708
2023-01-01
Abstract: Machine translation is an important subtask of natural language processing and can play a greater supporting role in the work of translators. Most current machine translation models use an end-to-end structure. However, whether absolute or relative positional encoding is adopted, the attention mechanism becomes distracted when processing long sequences. To solve this problem, a local attention mechanism based on a recurrent neural network is adopted so that local position information and hidden information in sentences can be captured. The local attention mechanism is introduced into the encoder of the Transformer, and experiments are carried out on Chinese-English and Japanese-English machine translation tasks. Compared with the baseline model, the proposed model improves the BLEU score by 2.09 on the Chinese-English dataset and by 1.34 on the Japanese-English dataset.
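The abstract does not give the details of the RNN-based local attention, so the following is only a generic illustration of the underlying idea of restricting attention to nearby positions. It is a minimal NumPy sketch of windowed self-attention, not the authors' model. The function name `local_self_attention` and the `window` parameter are hypothetical and chosen for this example.

```python
import numpy as np

def local_self_attention(x, window):
    """Scaled dot-product self-attention limited to positions within +/- window.

    Illustrative only: a generic banded attention, NOT the paper's RNN-based method.
    x has shape (sequence_length, model_dim).
    """
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)                      # (n, n) similarity scores
    idx = np.arange(n)
    band = np.abs(idx[:, None] - idx[None, :]) <= window
    scores = np.where(band, scores, -np.inf)           # mask out distant positions
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ x, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))        # toy sequence: 6 tokens, 4 features each
out, w = local_self_attention(x, window=1)
```

With `window=1`, each token attends only to itself and its immediate neighbors, so attention weight cannot spread across a long sequence.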