Semi-Supervised Neural Machine Translation Via Marginal Distribution Estimation

Yijun Wang, Yingce Xia, Li Zhao, Jiang Bian, Tao Qin, Enhong Chen, Tie-Yan Liu
DOI: https://doi.org/10.1109/taslp.2019.2921423
2019-01-01
IEEE/ACM Transactions on Audio Speech and Language Processing
Abstract: Neural machine translation (NMT) relies heavily on parallel bilingual corpora for training. Since large-scale, high-quality parallel corpora are usually costly to collect, it is appealing to exploit monolingual corpora to improve NMT. Inspired by the law of total probability, which connects the probability of a given target-side monolingual sentence to the conditional probability of translating from a source sentence to the target one, we propose to explicitly exploit this connection to help train NMT models using monolingual data. The key technical challenge of this approach is that computing the sum of the conditional probabilities over all possible source sentences is intractable, because there are exponentially many candidate source sentences for a target monolingual sentence. We address this challenge by leveraging the reverse translation model (target-to-source translation model) to sample several of the most likely source-side sentences, which avoids enumerating all possible candidate source sentences. We then propose two different methods to leverage the law of total probability: marginal distribution regularization and likelihood maximization of monolingual corpora. Experimental results on English -> French and German -> English tasks demonstrate that our methods achieve significant improvements over several strong baselines.
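
The relation described in the abstract can be written out as a short sketch. The abstract does not give the exact formulas, so the notation here is an assumption for illustration: x is a source sentence, y a target-side monolingual sentence, \theta the parameters of the source-to-target model, K the number of sampled source sentences, and P(x \mid y) the reverse (target-to-source) model used as the sampling distribution. The second line is the standard importance-sampling estimate of the sum, not necessarily the exact estimator used in the paper.

```latex
% Law of total probability: marginal of a target sentence y
% (sum over all source sentences x, which is intractable to enumerate)
P(y) \;=\; \sum_{x} P(x)\, P(y \mid x; \theta)

% Assumed importance-sampling approximation, drawing K candidate
% source sentences from the reverse translation model P(x | y)
P(y) \;=\; \mathbb{E}_{x \sim P(\cdot \mid y)}\!\left[ \frac{P(x)\, P(y \mid x; \theta)}{P(x \mid y)} \right]
\;\approx\; \frac{1}{K} \sum_{i=1}^{K} \frac{P\big(x^{(i)}\big)\, P\big(y \mid x^{(i)}; \theta\big)}{P\big(x^{(i)} \mid y\big)},
\qquad x^{(i)} \sim P(\cdot \mid y)
```

Under this reading, the two proposed methods would either penalize the gap between the estimated sum and an independent estimate of P(y), or maximize the estimated sum directly on monolingual data; the precise objectives are in the paper itself.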