Abstract: Sequence generation tasks, such as neural machine translation (NMT) and abstractive summarization, usually suffer from exposure bias as well as the error propagation problem due to autoregressive training and generation. Many previous works have discussed the relationship between error propagation and the accuracy drop problem (i.e., the right part of the generated sentence is often worse than its left part in left-to-right decoding models). In this paper, taking NMT as a typical sequence generation task, we measure the accuracy of the generated sentence with various metrics and conduct a series of analyses to better understand the accuracy drop problem. We obtain several interesting findings. First, the role of error propagation in the accuracy drop is overstated in the literature, although it is indeed one cause of the problem. Second, the characteristics of a language play a more important role in causing the accuracy drop: the left part of the generated sentence in a right-branching language (e.g., English) is more likely to be more accurate than its right part, while the right part is more accurate for a left-branching language (e.g., Japanese). Our findings are also confirmed on other generation tasks (e.g., image captioning, abstractive summarization, and language modeling) with multiple left/right-branching languages, as well as with various model structures.

Beyond Error Propagation: Language Branching Also Affects the Accuracy of Sequence Generation
