Automatic Detection of Chinese Generated Essays Based on Pre-trained BERT

Xingyuan Chen, Peng Jin, Siyuan Jing, Chunming Xie
DOI: https://doi.org/10.1109/ITAIC54216.2022.9836571
2022-06-17
Abstract: Text generators based on pre-trained language models have powerful generation ability, and an essay generator built on such a model produces essays of high quality. Essay generators may be misused: for example, by making some changes to a generated document, an essay of acceptable quality can be obtained. Corresponding efficient detection methods therefore need to be developed. First, we built an essay text generator based on GPT-2 with training data, and then developed a generated-essay detector with the pre-trained language model BERT using the generated data and real data. Experiments show that the detector is 88% accurate for random sampling and 92% accurate for top-k sampling. Therefore, the detector based on the pre-trained language model performs well and is an effective detector of generated essays.
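The abstract reports detection accuracy separately for random sampling and top-k sampling, two ways of drawing the next token from a language model's output distribution. The following is a minimal sketch of the difference, not the authors' implementation: the toy `logits`, the value `k = 2`, and the helper names `sample_random` and `sample_top_k` are all assumptions chosen for illustration, and the abstract does not state which k was used.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_random(logits, rng):
    """Random sampling: draw from the full distribution over all tokens."""
    probs = softmax(logits)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


def sample_top_k(logits, k, rng):
    """Top-k sampling: keep the k highest-scoring tokens, renormalize, draw."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top])
    return rng.choices(top, weights=probs, k=1)[0]


# Hypothetical next-token scores over a five-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0, -3.0]
rng = random.Random(0)

random_draws = [sample_random(logits, rng) for _ in range(1000)]
top_k_draws = [sample_top_k(logits, 2, rng) for _ in range(1000)]
```

With top-k sampling, low-probability tokens are never drawn, so generated text concentrates on the model's most likely continuations, whereas random sampling can reach the tail of the distribution. This difference in token statistics is one plausible reason a detector's accuracy could differ between the two settings.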