A Self-Supervised Automatic Post-Editing Data Generation Tool

Hyeonseok Moon, Chanjun Park, Sugyeong Eo, Jaehyung Seo, SeungJun Lee, Heuiseok Lim
DOI: https://doi.org/10.48550/arXiv.2111.12284
2021-11-24
Computation and Language
Abstract: Data building for automatic post-editing (APE) requires extensive and expert-level human effort, as it contains an elaborate process that involves identifying errors in sentences and providing suitable revisions. Hence, we develop a self-supervised data generation tool, deployable as a web application, that minimizes human supervision and constructs personalized APE data from a parallel corpus for several language pairs with English as the target language. Data-centric APE research can be conducted using this tool, involving many language pairs that have not been studied thus far owing to the lack of suitable data.