Abstract: Based on a unified encoder-decoder framework with an attention mechanism, neural machine translation (NMT) models have attracted much attention and become the mainstream approach in the machine translation community. NMT decoders generally produce translations left to right. As a result, only left-to-right target-side contexts from the generated translations are exploited, while right-to-left target-side contexts go completely unexploited. In this paper, we extend the conventional attentional encoder-decoder NMT framework with a backward decoder in order to explore asynchronous bidirectional decoding for NMT. In the first step after encoding, our backward decoder learns to generate the target-side hidden states right to left. Next, at each timestep of translation prediction, our forward decoder simultaneously considers both the source-side and the reverse target-side hidden states via two attention models. Compared with previous models, this architecture enables our model to fully exploit contexts from both the source side and the target side, which together improve translation quality. We conducted experiments on NIST Chinese-English, WMT English-German, and Finnish-English translation tasks to investigate the effectiveness of our model. Experimental results show that (1) our improved RNN-based NMT model significantly outperforms the conventional RNNSearch, by 1.44/-3.02, 1.11/-1.01, and 1.23/-1.27 average BLEU and TER points, respectively; and (2) our enhanced Transformer outperforms the standard Transformer by 1.56/-1.49, 1.76/-2.49, and 1.29/-1.33 average BLEU and TER points, respectively. We have released our code at https://github.com/DeepLearnXMU/ABD-NMT.
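The two-attention step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes plain dot-product attention in place of whatever scoring function the paper uses, it takes the source-side and backward-decoder hidden states as already computed, and all names (`attend`, `dual_context`, the toy vectors) are hypothetical. It shows only how a forward decoder query could gather one context from the source states and one from the reverse target-side states at a single timestep.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, states):
    """Dot-product attention (an assumption, chosen for brevity).

    Returns the attention-weighted average of `states` and the weights.
    """
    weights = softmax([dot(query, s) for s in states])
    dim = len(states[0])
    context = [sum(w * s[i] for w, s in zip(weights, states)) for i in range(dim)]
    return context, weights


def dual_context(query, source_states, backward_states):
    """One forward-decoder timestep: attend to both state sequences.

    `source_states` stands in for encoder hidden states; `backward_states`
    stands in for the hidden states produced right to left by the backward
    decoder. The two contexts are concatenated into one vector.
    """
    src_ctx, src_w = attend(query, source_states)
    bwd_ctx, bwd_w = attend(query, backward_states)
    return src_ctx + bwd_ctx, src_w, bwd_w


# Toy values, purely illustrative.
source_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
backward_states = [[0.5, 0.5], [0.0, 2.0]]
query = [1.0, 0.0]

context, src_w, bwd_w = dual_context(query, source_states, backward_states)
print(context, src_w, bwd_w)
```

In a full model the concatenated context would feed the forward decoder's state update and output layer; concatenation is one simple way to combine the two contexts and is an assumption of this sketch.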

Mutual Information and Diverse Decoding Improve Neural Machine Translation

A Simple, Fast Diverse Decoding Algorithm for Neural Generation

Increasing Visual Awareness in Multimodal Neural Machine Translation from an Information Theoretic Perspective

Asynchronous Bidirectional Decoding for Neural Machine Translation

Exploring Multi-Stage Information Interactions for Multi-Source Neural Machine Translation

Bilingual Mutual Information Based Adaptive Training for Neural Machine Translation

Better Simultaneous Translation with Monotonic Knowledge Distillation

Improving Character-based Decoding Using Target-Side Morphological Information for Neural Machine Translation

Improving Neural Machine Translation Model with Deep Encoding Information

Data Diversification: A Simple Strategy For Neural Machine Translation

Exploiting Reverse Target-Side Contexts for Neural Machine Translation Via Asynchronous Bidirectional Decoding

Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input

Reciprocal Supervised Learning Improves Neural Machine Translation

Helping the Weak Makes You Strong: Simple Multi-Task Learning Improves Non-Autoregressive Translators

Bi-Decoder Augmented Network for Neural Machine Translation

Deconvolution-Based Global Decoding for Neural Machine Translation

Decoding and Diversity in Machine Translation

Latent Attribute Based Hierarchical Decoder for Neural Machine Translation

Neural System Combination For Machine Translation

Neural Machine Translation by Jointly Learning to Align and Translate

Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation