Abstract: In this article, we show that word translations can be explicitly and effectively incorporated into NMT to avoid wrong translations. Specifically, we propose three cross-lingual encoders for this purpose: (1) Factored encoder, which encodes a word and its translation in a vertical way; (2) Gated encoder, which uses a gating mechanism to selectively control the amount of word translation information moving forward; and (3) Mixed encoder, which learns the annotations of a word and its translation by stitching them into a single sequence in which words and their translations alternate. In addition, we first use a simple word dictionary approach and then a word sense disambiguation (WSD) approach to model the word context for better word translation. Experiments on Chinese-to-English translation demonstrate that all proposed encoders improve translation accuracy for both traditional RNN-based NMT and recent self-attention-based NMT (hereafter referred to as Transformer). Specifically, Mixed encoder yields the most significant improvement of 2.0 BLEU on the RNN-based NMT, while Gated encoder yields an improvement of 1.2 BLEU on Transformer. This indicates the usefulness of a WSD approach in modeling word context for better word translation. It also indicates the effectiveness of our proposed cross-lingual encoders in explicitly modeling word translations to avoid wrong translations in NMT. Finally, we discuss in depth how word translations benefit different NMT frameworks from several perspectives.
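
As a rough illustration of two of the ideas named in the abstract, the sketch below builds a mixed sequence in which source words and their dictionary translations alternate, and combines a word vector with a translation vector through a scalar sigmoid gate. It is not the authors' implementation: the function names `mix_sequence` and `gated_combine`, the `<unk>` placeholder, the scalar gate, and the additive combination are all assumptions made for illustration, since the abstract does not give the exact formulation.

```python
import math


def mix_sequence(words, dictionary, unk="<unk>"):
    """Interleave each source word with its dictionary translation.

    Hypothetical helper: words missing from the dictionary are paired
    with an assumed placeholder token.
    """
    mixed = []
    for word in words:
        mixed.append(word)
        mixed.append(dictionary.get(word, unk))
    return mixed


def gated_combine(word_vec, trans_vec, gate_weights, gate_bias=0.0):
    """Add a gated share of the translation vector to the word vector.

    Assumed form: a single scalar gate g = sigmoid(w . [word; trans] + b),
    output = word + g * trans. The paper's actual gate may differ.
    """
    concat = list(word_vec) + list(trans_vec)
    score = sum(w * x for w, x in zip(gate_weights, concat)) + gate_bias
    gate = 1.0 / (1.0 + math.exp(-score))
    return [x + gate * t for x, t in zip(word_vec, trans_vec)]


if __name__ == "__main__":
    toy_dictionary = {"我": "I", "猫": "cat"}  # toy data, not from the paper
    print(mix_sequence(["我", "爱", "猫"], toy_dictionary))
    print(gated_combine([1.0, 2.0], [4.0, 6.0], [0.0, 0.0, 0.0, 0.0]))
```

With all gate weights set to zero the gate is 0.5, so half of the translation vector is added; in a trained model the weights would be learned so that unreliable translations are passed through less strongly.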

Modeling Discourse Structure for Document-level Neural Machine Translation

Exploring Discourse Structure in Document-level Machine Translation

Context Modeling with Hierarchical Shallow Attention Structure for Document-Level NMT

Document Graph for Neural Machine Translation

A Hierarchy-to-Sequence Attentional Neural Machine Translation Model

Modeling Coherence for Discourse Neural Machine Translation

Paragraph-Parallel Based Neural Machine Translation Model with Hierarchical Attention

Document-level Neural Machine Translation with Inter-Sentence Attention

Paragraph-Level Hierarchical Neural Machine Translation

Document Sub-structure in Neural Machine Translation

Towards Making the Most of Context in Neural Machine Translation

Document-level Neural Machine Translation with Document Embeddings

Multi-Hop Transformer for Document-Level Machine Translation

Explicitly Modeling Word Translations in Neural Machine Translation

Evaluating Discourse Phenomena in Neural Machine Translation

TransSent: Towards Generation of Structured Sentences with Discourse Marker

Improved Discourse Parsing with Two-Step Neural Transition-Based Model

Rethinking Document-level Neural Machine Translation

Only 5% Attention Is All You Need: Efficient Long-range Document-level Neural Machine Translation

Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation

Leveraging Discourse Rewards for Document-Level Neural Machine Translation