Leveraging Denoised Abstract Meaning Representation for Grammatical Error Correction

Hejing Cao, Dongyan Zhao
2023-07-05
Abstract: Grammatical Error Correction (GEC) is the task of correcting errorful sentences into grammatically correct, semantically consistent, and coherent sentences. Popular GEC models either use large-scale synthetic corpora or use a large number of human-designed rules. The former is costly to train, while the latter requires considerable human expertise. In recent years, Abstract Meaning Representation (AMR), a semantic representation framework, has been widely used in many natural language tasks due to its completeness and flexibility. A non-negligible concern is that AMRs of grammatically incorrect sentences may not be entirely reliable. In this paper, we propose AMR-GEC, a seq-to-seq model that incorporates denoised AMR as additional knowledge. Specifically, we design a semantic aggregated GEC model and explore denoising methods to make AMRs more reliable. Experiments on the BEA-2019 shared task and the CoNLL-2014 shared task show that AMR-GEC performs comparably to a set of strong baselines that use a large amount of synthetic data. Compared with the T5 model with synthetic data, AMR-GEC can reduce the training time by 32% while inference time is comparable. To the best of our knowledge, we are the first to incorporate AMR for grammatical error correction.
Computation and Language
What problem does this paper attempt to address?
The paper addresses two main issues in the Grammatical Error Correction (GEC) task:
1. **Issues with Data Augmentation Methods**: Current popular GEC models either rely on large-scale synthetic corpora for training or depend on a large number of manually designed rules. The former has high training costs, while the latter requires extensive human expertise.
2. **AMR Reliability Issue**: The paper proposes using Abstract Meaning Representation (AMR) as additional knowledge to improve GEC models. However, the AMR of an erroneous sentence may be unreliable, so incorporating it directly might confuse the model.
To address these issues, the authors propose AMR-GEC, a seq-to-seq framework that incorporates denoised AMR as additional knowledge through a semantic aggregated GEC model, and they explore denoising methods to make the AMRs more reliable. Experiments on the BEA-2019 and CoNLL-2014 shared tasks show that AMR-GEC performs comparably to strong baselines that use a large amount of synthetic data. Compared with a T5 model trained with synthetic data, it reduces training time by 32% while keeping inference time comparable.