An Adversarial Multi-Task Learning Method for Chinese Text Correction with Semantic Detection

Fanyu Wang, Zhenping Xie
DOI: https://doi.org/10.48550/arXiv.2306.16313
2023-06-28
Computation and Language
Abstract: Text correction, especially semantic correction in widely used scenarios, is in strong demand for improving the fluency and writing efficiency of text. An adversarial multi-task learning method is proposed to enhance the modeling and detection of character polysemy in Chinese sentence context. Two models, a masked language model and a scoring language model, are introduced as a pair of learning tasks that are both coupled and adversarial. Moreover, a Monte Carlo tree search strategy and a policy network are introduced to perform efficient Chinese text correction with semantic detection. Experiments are conducted on three datasets against five comparable methods, and the results show that our method achieves good performance on the Chinese text correction task, with better semantic rationality.
What problem does this paper attempt to address?
The paper addresses complex semantic errors in Chinese text correction. Specifically, the authors propose a new adversarial multi-task learning method that handles character-level and phrase-level correction in a unified way. The method improves the model's ability to model and detect the polysemy of Chinese characters by training a masked language model and a scoring language model as a pair of adversarial learning tasks. To make correction efficient, the paper also introduces a Monte Carlo Tree Search (MCTS) strategy and a policy network, which improve the computational efficiency and accuracy of the search for error locations. Experimental results show that the method outperforms existing methods on three datasets and has significant advantages in correction scenarios of different lengths. Overall, the main contribution of the paper is a new framework that can effectively handle complex semantic errors in Chinese text.
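As a rough illustration of how a scoring model, a masked model, and a guided search can interact when locating an error, here is a minimal Python sketch. It is not the authors' implementation. The bigram table (`PLAUSIBLE_BIGRAMS`), the candidate list (`CANDIDATES`), the uniform `prior`, and all function names are hypothetical stand-ins for the scoring language model, the masked language model, and the policy network. The search is collapsed to a single level with a PUCT-style selection rule, which is an assumption rather than the MCTS configuration used in the paper.

```python
import math

# Hypothetical bigram table standing in for a scoring language model.
PLAUSIBLE_BIGRAMS = {("今", "天"), ("天", "天"), ("天", "气"), ("气", "很"), ("很", "好")}
# Hypothetical candidate set standing in for a masked language model's predictions.
CANDIDATES = ["今", "天", "气", "很", "好"]


def score(chars):
    """Toy sentence score: fraction of adjacent character pairs found in the table."""
    pairs = list(zip(chars, chars[1:]))
    return sum(p in PLAUSIBLE_BIGRAMS for p in pairs) / len(pairs)


def best_fill(chars, pos):
    """Mask position `pos`; return the best-scoring candidate and the resulting score."""
    def filled(c):
        return chars[:pos] + [c] + chars[pos + 1:]
    best = max(CANDIDATES, key=lambda c: score(filled(c)))
    return best, score(filled(best))


def search_error_position(chars, prior, n_sim=50, c_puct=1.0):
    """One-level, PUCT-style search for the position most worth correcting.

    `prior` plays the role of a policy network's output (assumed, here uniform).
    The reward of visiting a position is the score gain from its best replacement.
    """
    n = len(chars)
    visits = [0] * n
    value_sum = [0.0] * n
    base = score(chars)
    for _ in range(n_sim):
        total = sum(visits)

        def puct(i):
            q = value_sum[i] / visits[i] if visits[i] else 0.0
            u = c_puct * prior[i] * math.sqrt(total + 1) / (1 + visits[i])
            return q + u

        i = max(range(n), key=puct)
        _, new_score = best_fill(chars, i)
        visits[i] += 1
        value_sum[i] += new_score - base
    # The most-visited position is taken as the suspected error location.
    return max(range(n), key=lambda i: visits[i])


sentence = list("今天天汽很好")  # "汽" is a deliberate error for "气"
prior = [1 / len(sentence)] * len(sentence)
pos = search_error_position(sentence, prior)
fix, _ = best_fill(sentence, pos)
corrected = sentence[:pos] + [fix] + sentence[pos + 1:]
print(pos, "".join(corrected))
```

In this toy setup the search concentrates its visits on the position whose replacement raises the sentence score most, so most positions are probed only a few times. A learned prior would sharpen this further by steering early visits toward likely error locations.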