Abstract: Based on a unified encoder-decoder framework with an attention mechanism, neural machine translation (NMT) models have attracted much attention and become the mainstream approach in the machine translation community. Generally, NMT decoders produce translations in a left-to-right manner. As a result, only left-to-right target-side contexts from the generated translations are exploited, while right-to-left target-side contexts are left completely unexploited. In this paper, we extend the conventional attentional encoder-decoder NMT framework by introducing a backward decoder, in order to explore asynchronous bidirectional decoding for NMT. In the first step after encoding, our backward decoder learns to generate the target-side hidden states in a right-to-left manner. Next, at each timestep of translation prediction, our forward decoder concurrently considers both the source-side and the reverse target-side hidden states via two attention models. Compared with previous models, this architecture enables our model to fully exploit contexts from both the source side and the target side, which together improve translation quality. We conducted experiments on NIST Chinese-English, WMT English-German and Finnish-English translation tasks to investigate the effectiveness of our model. Experimental results show that (1) our improved RNN-based NMT model achieves significant improvements over the conventional RNNSearch of 1.44/-3.02, 1.11/-1.01, and 1.23/-1.27 average BLEU and TER points, respectively; and (2) our enhanced Transformer outperforms the standard Transformer by 1.56/-1.49, 1.76/-2.49, and 1.29/-1.33 average BLEU and TER points, respectively. We have released our code at https://github.com/DeepLearnXMU/ABD-NMT.
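
The dual-attention step described in the abstract can be illustrated with a minimal sketch. It is not taken from the released code: the function names (`attend`, `forward_step_contexts`) are hypothetical, plain dot-product attention is swapped in for whatever scoring function the models actually use, and the hidden states are toy lists of floats rather than learned representations. It only shows how a forward decoder state might query the source-side states and the backward decoder's states separately and combine the two resulting context vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, states):
    """Dot-product attention (an assumption made for brevity).

    Returns the attention-weighted average of `states` and the weights.
    """
    scores = [sum(q * s for q, s in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, states))
        for i in range(dim)
    ]
    return context, weights


def forward_step_contexts(query, source_states, backward_states):
    """One forward-decoder timestep (hypothetical helper).

    `source_states` stands in for the encoder hidden states and
    `backward_states` for the hidden states produced earlier by the
    right-to-left backward decoder. Two separate attention passes are
    run, and their context vectors are concatenated.
    """
    src_ctx, src_weights = attend(query, source_states)
    bwd_ctx, bwd_weights = attend(query, backward_states)
    return src_ctx + bwd_ctx, src_weights, bwd_weights


if __name__ == "__main__":
    # Toy values only; real hidden states would come from trained networks.
    source = [[1.0, 0.0], [0.0, 1.0]]
    backward = [[2.0, 2.0], [4.0, 4.0], [6.0, 6.0]]
    combined, src_w, bwd_w = forward_step_contexts([0.5, -0.5], source, backward)
    print(combined, src_w, bwd_w)
```

In this sketch the combined context is simply a concatenation; how the two contexts are actually merged inside the forward decoder is a design choice that the abstract does not specify.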

Non-autoregressive Translation with Dependency-Aware Decoder

Hybrid-Regressive Paradigm for Accurate and Speed-Robust Neural Machine Translation

Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input

Non-Autoregressive Machine Translation with Auxiliary Regularization

Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation

Guiding Non-Autoregressive Neural Machine Translation Decoding with Reordering Information

Context-Aware Cross-Attention for Non-Autoregressive Translation

Directed Acyclic Transformer for Non-Autoregressive Machine Translation

Neighbors Are Not Strangers: Improving Non-Autoregressive Translation under Low-Frequency Lexical Constraints

Asynchronous Bidirectional Decoding for Neural Machine Translation

Multi-Task Learning with Shared Encoder for Non-Autoregressive Machine Translation

NAT4AT: Using Non-Autoregressive Translation Makes Autoregressive Translation Faster and Better

Imitation Learning for Non-Autoregressive Neural Machine Translation

Learning to Rewrite for Non-Autoregressive Neural Machine Translation

Exploiting Reverse Target-Side Contexts for Neural Machine Translation Via Asynchronous Bidirectional Decoding

What Have We Achieved on Non-autoregressive Translation?

Improving Non-autoregressive Translation Quality with Pretrained Language Model, Embedding Distillation and Upsampling Strategy for CTC

Fuzzy Alignments in Directed Acyclic Graph for Non-Autoregressive Machine Translation

Glancing Transformer for Non-Autoregressive Neural Machine Translation

Fine-Tuning by Curriculum Learning for Non-Autoregressive Neural Machine Translation

Non-Autoregressive Translation by Learning Target Categorical Codes