Controlling the Transition of Hidden States for Neural Machine Translation

Zaixiang Zheng, Shujian Huang, Xin-Yu Dai, Jiajun Chen
DOI: https://doi.org/10.1007/978-981-13-3083-4_8
2019-01-01
Abstract: Neural Machine Translation (NMT) models based on Recurrent Neural Networks (RNNs) under an encoder-decoder framework have recently shown significant improvements in translation quality. Given the encoded representation of the source sentence, an NMT system generates the translated sentence word by word, depending on the hidden states of the decoder. The decoder hidden state is updated at each decoding step and determines the next word to be generated. The transitions of the hidden states between successive steps therefore contribute to the choice of the next token of the translation, an aspect that has received little attention in previous work. In this work, we propose an explicit supervised objective on the transitions of the decoder hidden states, aiming to help the model learn the transitional patterns better. We first model the increment of the transition with the proposed subtraction operation. We then require the increment to be predictive of the word to be translated. The proposed approach strengthens the relationship between the transition of the decoder and the translation. Empirical evaluation shows considerable improvements on Chinese-English, German-English, and English-German translation tasks, demonstrating the effectiveness of our approach.
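The abstract describes the objective only in words, so the following is a minimal sketch of one way it could look, not the authors' implementation. It assumes the subtraction operation is a plain element-wise difference between successive decoder states, and that the increment is scored against the word emitted at that step with a softmax cross-entropy. The names `subtract`, `project`, `transition_loss`, and the projection matrix `W` are hypothetical, and the alignment between increments and target words is likewise an assumption.

```python
import math

def subtract(s_t, s_prev):
    """Assumed form of the increment: element-wise s_t - s_prev."""
    return [a - b for a, b in zip(s_t, s_prev)]

def project(W, v):
    """Map a vector to vocabulary logits with a (hypothetical) matrix W."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def transition_loss(states, targets, W):
    """Average negative log-likelihood of each target word given the increment.

    states[0] is the initial decoder state and states[t] the state after
    step t; targets[t - 1] is the index of the word emitted at step t
    (an assumed alignment, not stated in the abstract).
    """
    loss = 0.0
    for t in range(1, len(states)):
        delta = subtract(states[t], states[t - 1])
        probs = softmax(project(W, delta))
        loss -= math.log(probs[targets[t - 1]])
    return loss / (len(states) - 1)

# Toy example: 2-dimensional states, vocabulary of 2 words.
states = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
targets = [0, 1]
aux = transition_loss(states, targets, W)
```

In training, such a term would presumably be added to the usual translation loss with a weighting factor; the abstract does not say how the two are combined.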