Twin-GAN for Neural Machine Translation

Jiaxu Zhao, Li Huang, Ruixuan Sun, Liao Bing, Hong Qu
DOI: https://doi.org/10.5220/0010217300870096
2021-01-01
Abstract: In recent years, Neural Machine Translation (NMT) has achieved great success, but two important problems cannot be ignored. The first is exposure bias, caused by the mismatch between the decoding strategies used in training and in inference. The second is that an NMT model generates the best candidate word for the current step, which may nevertheless be a poor element of the sentence as a whole. The popular methods for these two problems are Scheduled Sampling and Generative Adversarial Networks (GANs), respectively, and both have achieved some success. In this paper, we propose a more precise approach called "similarity selection", combined with a new GAN structure called twin-GAN, to solve both problems. The twin-GAN contains two generators and two discriminators. One generator uses "similarity selection", and the other decodes in the same way as inference (that is, it simulates the inference process). One discriminator guides the generators at the sentence level, and the other forces the two generators to have similar distributions. We performed extensive experiments on the IWSLT 2014 German -> English (De -> En) and WMT'17 Chinese -> English (Zh -> En) tasks, and the results show that our approach improves performance over several strong baseline models based on recurrent architectures.
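The exposure bias named in the abstract can be illustrated with a toy sketch. This is not the paper's model or its "similarity selection" method. The lookup table `MODEL`, the sentence `REFERENCE`, and both function names are hypothetical, chosen only to show how decoding from gold prefixes (training style) can hide errors that compound when the model decodes from its own outputs (inference style).

```python
# Hypothetical toy next-token "model": maps the previous token to the
# predicted next token. It is wrong after "the" (the reference says "cat").
MODEL = {
    "<s>": "the",
    "the": "dog",
    "cat": "sat",
    "dog": "barked",
    "sat": "</s>",
    "barked": "</s>",
}

# Hypothetical reference translation.
REFERENCE = ["the", "cat", "sat", "</s>"]


def decode_teacher_forced(reference):
    """Training-style decoding: each step sees the gold previous token."""
    outputs = []
    prev = "<s>"
    for gold in reference:
        outputs.append(MODEL[prev])
        prev = gold  # the ground truth is fed back, not the prediction
    return outputs


def decode_free_running(max_len=10):
    """Inference-style decoding: each step sees the model's own last output."""
    outputs = []
    prev = "<s>"
    for _ in range(max_len):
        token = MODEL[prev]
        outputs.append(token)
        if token == "</s>":
            break
        prev = token  # the prediction is fed back, so errors propagate
    return outputs


def count_errors(hypothesis, reference):
    """Position-wise mismatches between two equal-length token lists."""
    return sum(h != r for h, r in zip(hypothesis, reference))


if __name__ == "__main__":
    forced = decode_teacher_forced(REFERENCE)
    free = decode_free_running()
    print(forced, count_errors(forced, REFERENCE))
    print(free, count_errors(free, REFERENCE))
```

In this toy setup, teacher-forced decoding makes one error and then recovers because the gold token "cat" is fed back, while free-running decoding carries the wrong token "dog" forward and makes a second error. A generator that simulates the inference process, as the abstract describes, is trained under the second condition.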