Abstract: Grammatical error correction (GEC) is an important application of natural language processing, and GEC systems are a class of intelligent systems that have long been explored in both the academic and industrial communities. The past decade has seen significant progress in GEC, owing to the growing popularity of machine learning and deep learning. However, no survey has yet untangled the large body of research and progress in this field. We present the first survey of GEC, giving a comprehensive retrospective of the literature in this area. We first define the GEC task and introduce the public datasets and data annotation schema. We then discuss six kinds of basic approaches, six commonly applied performance-boosting techniques for GEC systems, and three data augmentation methods. Since GEC is typically viewed as a sister task of machine translation (MT), we place more emphasis on statistical machine translation (SMT)-based and neural machine translation (NMT)-based approaches because of their importance. Similarly, some performance-boosting techniques are adapted from MT and have been successfully combined with GEC systems to improve final performance. More importantly, after introducing evaluation in GEC, we present an in-depth analysis of empirical results for both GEC approaches and GEC systems to reveal a clearer pattern of progress in GEC, including error type analysis and system recapitulation. Finally, we discuss five prospective directions for future GEC research.

Grammatical Error Correction: More Data with More Context

Improving Grammatical Error Correction via Contextual Data Augmentation

Efficient Grammatical Error Correction Via Multi-Task Training and Optimized Training Schedule

Leveraging Denoised Abstract Meaning Representation for Grammatical Error Correction

Grammatical Error Correction with Dependency Distance

Improving Grammatical Error Correction with Data Augmentation by Editing Latent Representation

TransGEC: Improving Grammatical Error Correction with Translationese

Leveraging Adversarial Training to Facilitate Grammatical Error Correction

Adversarial Grammatical Error Correction

Grammatical Error Correction via Mixed-Grained Weighted Training

Synthetic Data Generation for Grammatical Error Correction with Tagged Corruption Models

Grammatical Error Correction: A Survey of the State of the Art

Spoken Language ‘Grammatical Error Correction’

Online Self-boost Learning for Chinese Grammatical Error Correction

GEC-DePenD: Non-Autoregressive Grammatical Error Correction with Decoupled Permutation and Decoding

A Comprehensive Survey of Grammatical Error Correction

Automatic Grammatical Error Correction Based on Edit Operations Information

Pillars of Grammatical Error Correction: Comprehensive Inspection Of Contemporary Approaches In The Era of Large Language Models

Grammatical Error Correction as GAN-like Sequence Labeling

Do Grammatical Error Correction Models Realize Grammatical Generalization?

Improving Grammatical Error Correction Models with Purpose-Built Adversarial Examples