Multi-granularity Knowledge Sharing in Low-resource Neural Machine Translation

Chenggang Mi, Shaoliang Xie, Yi Fan
DOI: https://doi.org/10.1145/3639930
IF: 1.471
2024-02-08
ACM Transactions on Asian and Low-Resource Language Information Processing
Abstract: With the rapid development of deep learning methods, neural machine translation (NMT) has attracted increasing attention in recent years. However, the lack of bilingual resources seriously degrades the performance of low-resource NMT models. To overcome this problem, several studies have focused on transferring knowledge from high-resource language pairs to low-resource language pairs. These methods usually focus on a single granularity of language, and parameter sharing among different granularities in NMT is not well studied. In this article, we propose to improve parameter sharing in low-resource NMT by introducing multi-granularity knowledge, such as word-, phrase-, and sentence-level knowledge. This knowledge can be monolingual or bilingual. We build the knowledge sharing model for low-resource NMT on a multi-task learning framework, with three auxiliary tasks selected for low-resource NMT: syntax parsing, cross-lingual named entity recognition, and natural language generation. Experimental results show that the proposed method consistently outperforms six strong baseline systems on several low-resource language pairs.
computer science, artificial intelligence