Abstract: Autoregressive (AR) and non-autoregressive (NAR) models are two types of generative models for Neural Machine Translation (NMT). AR models predict tokens one at a time, each conditioned on the previously generated tokens, and can effectively capture the distribution of real translations. NAR models predict tokens in parallel using bidirectional contextual information, which improves inference speed but degrades translation quality. Previous works either used AR models to enhance NAR models by reducing the complexity of the training data, or used NAR models to incorporate global information into AR models. However, these methods exploit the contextual information of only a single type of model and neglect the diversity of contextual information that different types of models can provide. In this paper, we propose a novel generic collaborative learning method, DCMCL, in which AR and NAR models are treated as collaborators rather than as teachers and students. To leverage the bilateral contextual information hierarchically, token-level mutual learning and sequence-level contrastive learning are adopted between the AR and NAR models. Extensive experiments on four widely used benchmarks show that DCMCL simultaneously improves both AR and NAR models, by up to 1.38 and 2.98 BLEU respectively, and outperforms the current best unified model by up to 0.97 BLEU for both AR and NAR decoding.
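The two collaboration levels named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact objectives, so the sketch assumes a symmetric KL divergence for token-level mutual learning and an InfoNCE-style loss over sentence vectors for sequence-level contrastive learning. The function names, the `temperature` value, and the use of cosine similarity are hypothetical choices for illustration, not the paper's formulation.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one vocabulary-sized logit vector.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def token_mutual_loss(ar_logits, nar_logits):
    # Assumed form: symmetric KL between the AR and NAR output
    # distributions, averaged over target positions.
    total = 0.0
    for a, n in zip(ar_logits, nar_logits):
        p, q = softmax(a), softmax(n)
        total += 0.5 * (kl(p, q) + kl(q, p))
    return total / len(ar_logits)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def sequence_contrastive_loss(ar_vecs, nar_vecs, temperature=0.1):
    # Assumed form: InfoNCE. The AR sentence vector at index i should be
    # closest to the NAR vector at the same index; the other sentences in
    # the batch act as negatives.
    total = 0.0
    for i, u in enumerate(ar_vecs):
        sims = [cosine(u, v) / temperature for v in nar_vecs]
        m = max(sims)
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims))
        total += log_denom - sims[i]
    return total / len(ar_vecs)

# Toy inputs: two target positions over a three-word vocabulary,
# and a batch of two sentence vectors per model.
ar_logits = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]]
nar_logits = [[1.0, 1.0, 0.1], [0.1, 0.9, 1.2]]
ar_vecs = [[1.0, 0.0], [0.0, 1.0]]
nar_vecs = [[1.0, 0.0], [0.0, 1.0]]

print(token_mutual_loss(ar_logits, nar_logits))
print(sequence_contrastive_loss(ar_vecs, nar_vecs))
```

In this sketch both losses are symmetric in the two models, which matches the abstract's framing of AR and NAR models as collaborators rather than teacher and student. How the terms are weighted against the translation loss is not stated in the abstract and is left out here.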

Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation

Towards Making the Most of Context in Neural Machine Translation

Diving Deep into Context-Aware Neural Machine Translation

Exploiting Cross-Sentence Context for Neural Machine Translation

Context-Adaptive Document-Level Neural Machine Translation

Capturing document context inside sentence-level neural machine translation models with self-training

SMDT: Selective Memory-Augmented Neural Document Translation

Document-level Neural Machine Translation with Document Embeddings

Neural Machine Translation with Sentence-Level Topic Context

Improving Context-Aware Neural Machine Translation with Source-side Monolingual Documents

Document-Level Machine Translation with Effective Batch-Level Context Representation

Improving Neural Machine Translation with Pre-trained Representation

Translation Prediction with Source Dependency-Based Context Representation

Pretrained Language Models for Document-Level Neural Machine Translation

Context-Aware Learning for Neural Machine Translation

A Neural Approach to Source Dependence Based Context Model for Statistical Machine Translation

Context Modeling with Hierarchical Shallow Attention Structure for Document-Level NMT

Selective Memory-Augmented Document Translation with Diverse Global Context

Leveraging Diverse Modeling Contexts with Collaborating Learning for Neural Machine Translation

A Case Study on Context-Aware Neural Machine Translation with Multi-Task Learning

HanoiT: Enhancing Context-aware Translation via Selective Context