Semi-supervised neural machine translation via marginal distribution estimation

Yijun Wang, Yingce Xia, Li Zhao, Jiang Bian, Tao Qin, Enhong Chen, Tie-Yan Liu
2019-06-06
Abstract: Neural machine translation (NMT) relies heavily on parallel bilingual corpora for training. Since large-scale, high-quality parallel corpora are usually costly to collect, it is appealing to exploit monolingual corpora to improve NMT. Inspired by the law of total probability, which connects the probability of a given target-side monolingual sentence to the conditional probability of translating from a source sentence to the target one, we propose to explicitly exploit this connection to help train NMT models using monolingual data. The key technical challenge of this approach is that there are exponentially many source sentences to consider for a target monolingual sentence when computing the sum of the conditional probabilities given each possible source sentence. We address this challenge by leveraging the reverse translation model (target-to-source translation model) to sample several most likely source-side …
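To make the idea in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation. It writes the law of total probability as P(y) = sum over x of P(x) * P(y | x) and approximates that sum from a few sampled sources. The use of importance sampling with the reverse model as the proposal is an assumption made here for illustration, since the excerpt above is truncated before it describes the estimator. All probabilities and the names `p_x`, `p_y_given_x`, `q_x_given_y`, `exact_marginal`, and `sampled_marginal` are hypothetical.

```python
import random

# Toy setup (all numbers are made up for illustration):
# three candidate source sentences x and one fixed target sentence y.
p_x = [0.5, 0.3, 0.2]             # source-side prior P(x)
p_y_given_x = [0.10, 0.40, 0.05]  # forward translation model P(y | x)
q_x_given_y = [0.25, 0.65, 0.10]  # reverse model, used here as the proposal


def exact_marginal(p_x, p_y_given_x):
    """Law of total probability: P(y) = sum_x P(x) * P(y | x)."""
    return sum(px * pyx for px, pyx in zip(p_x, p_y_given_x))


def sampled_marginal(p_x, p_y_given_x, q, k, rng):
    """Importance-sampling estimate of P(y) from k sources drawn from q.

    Assumption: q(x) > 0 wherever P(x) * P(y | x) > 0.
    """
    xs = rng.choices(range(len(q)), weights=q, k=k)
    weights = [p_x[x] * p_y_given_x[x] / q[x] for x in xs]
    return sum(weights) / k


rng = random.Random(0)
exact = exact_marginal(p_x, p_y_given_x)
estimate = sampled_marginal(p_x, p_y_given_x, q_x_given_y, k=20000, rng=rng)
print(f"exact P(y) = {exact:.4f}, sampled estimate = {estimate:.4f}")
```

In this toy case the exact sum is cheap because there are only three sources. With real sentences the sum ranges over exponentially many candidates, which is why a sampled approximation is attractive. The closer the proposal is to the true posterior P(x | y), the lower the variance of the estimate.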