Bidirectional Boost: On Improving Tibetan-Chinese Neural Machine Translation With Back-Translation and Self-Learning

Sangjie Duanzhu, Rui Zhang, Cairang Jia
DOI: https://doi.org/10.1145/3446132.3446405
2020-12-24
Abstract: Despite the remarkable success of Neural Machine Translation (NMT) systems, challenges remain, notably their weak performance in low-resource conditions. In recent years, mechanisms that exploit source-side monolingual data, target-side monolingual data, or both within the NMT framework have gained much attention in the field. Among the many supervised and unsupervised proposals, back-translation is increasingly seen as one of the most promising methods for improving low-resource NMT performance. Despite its simplicity, the effectiveness of back-translation depends heavily on the performance of the backward model, which is initially trained on the available parallel data. To address this dilemma of back-translation in low-resource scenarios, we propose to employ target-side monolingual data to improve both the backward and forward models through step-wise adoption of self-learning and back-translation, an approach we refer to as Bidirectional Boost. Our experiments on a Tibetan-Chinese translation task support the proposed approach, with gains of 3.1 and 8.2 BLEU on the forward and backward models, respectively, over vanilla Transformers trained on genuine parallel data under supervised settings.
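The abstract does not spell out the exact training schedule, so the following is only a minimal sketch of one plausible reading of the step-wise procedure: the backward model is first improved by self-learning on its own translations of target-side monolingual data, and the improved backward model then back-translates that data for the forward model. The names `bidirectional_boost`, `train`, and `toy_train` are hypothetical, and the lookup-table "model" is a stand-in for a real Transformer, used only to make the data flow runnable.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]                    # (input sentence, output sentence)
Translator = Callable[[str], str]
Trainer = Callable[[List[Pair]], Translator]


def bidirectional_boost(
    parallel: List[Pair],                 # genuine (source, target) pairs
    mono_target: List[str],               # target-side monolingual sentences
    train: Trainer,                       # hypothetical: builds a model from pairs
) -> Tuple[Translator, Translator]:
    """Assumed ordering of the steps; the abstract does not fix it."""
    # Step 0: baseline backward model (target -> source) from genuine data.
    backward_data = [(tgt, src) for src, tgt in parallel]
    backward = train(backward_data)

    # Step 1 (self-learning): the backward model translates monolingual
    # target sentences and is retrained on genuine plus self-labeled pairs.
    self_labeled = [(tgt, backward(tgt)) for tgt in mono_target]
    backward = train(backward_data + self_labeled)

    # Step 2 (back-translation): the retrained backward model produces
    # synthetic sources, paired with genuine targets, for the forward model.
    back_translated = [(backward(tgt), tgt) for tgt in mono_target]
    forward = train(parallel + back_translated)
    return forward, backward


def toy_train(pairs: List[Pair]) -> Translator:
    """Toy stand-in for NMT training: memorize pairs, echo unknown input."""
    table = dict(pairs)
    return lambda sentence: table.get(sentence, sentence)


# Toy usage with placeholder strings instead of real sentences.
forward, backward = bidirectional_boost(
    parallel=[("a", "A")],
    mono_target=["A", "B"],
    train=toy_train,
)
```

In a real setting, `train` would wrap a full NMT training run, and the synthetic pairs would typically be filtered or weighted before being mixed with genuine data; those choices are outside what the abstract states.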