History Attention for Source-Target Alignment in Neural Machine Translation.

Yan Huang, Wenhan Chao, Peidong Zhang, Yuanyuan Yu
DOI: https://doi.org/10.1109/icaci.2018.8377531
2018-01-01
Abstract: The attention mechanism has advanced state-of-the-art Neural Machine Translation (NMT) by focusing on parts of the source sentence when predicting each target word. However, we find that in most models the attention context vector is computed directly from the current decoder hidden state. This tends to ignore previously translated information, which often leads to over-translation and under-translation. When the target sentence is very long, or when the words within it are only loosely related (for example, when the sentence contains separators), the model can produce incorrect translations. To address these problems, we propose a history attention structure that takes advantage of translated information. This architecture readily captures history information, helps the model alleviate the memory vanishing problem introduced by long sentences, and avoids focusing on a single local part of the source. In experiments, we show that our history attention with a gate improves translation quality.
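The dependence described in the abstract can be illustrated with a small sketch. The first function is ordinary dot-product attention, where the weights are a function of the current decoder state only. The second function is a hypothetical illustration, not the paper's architecture: the abstract does not specify how history is combined, so the names `gated_history_context` and `gate`, the averaging of past context vectors, and the fixed scalar gate (a real model would presumably learn it) are all assumptions made for this example.

```python
import math


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_context(decoder_state, encoder_states):
    """Dot-product attention: weights depend only on the current decoder state."""
    weights = softmax([dot(decoder_state, h) for h in encoder_states])
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return context, weights


def gated_history_context(context, past_contexts, gate):
    """Hypothetical blend of the current context with the mean of past contexts.

    `gate` is a scalar in [0, 1]; 1.0 ignores history, 0.0 uses history only.
    This is an assumed form, not the formulation used in the paper.
    """
    if not past_contexts:
        return list(context)
    dim = len(context)
    history = [
        sum(c[i] for c in past_contexts) / len(past_contexts)
        for i in range(dim)
    ]
    return [gate * context[i] + (1.0 - gate) * history[i] for i in range(dim)]


# Toy usage with two source positions and two-dimensional states.
encoder_states = [[1.0, 0.0], [0.0, 1.0]]
context, weights = attention_context([1.0, 0.0], encoder_states)
blended = gated_history_context(context, [[0.0, 1.0]], gate=0.5)
```

With `gate=0.5`, the blended vector sits halfway between the current context and the single past context, so earlier attention still influences the current step.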