Self-training vs Pre-trained Embeddings for Automatic Essay Scoring.

Xianbing Zhou, Liang Yang, Xiaochao Fan, Ge Ren, Yang Yong, Hongfei Lin
DOI: https://doi.org/10.1007/978-3-030-88189-4_12
2021-01-01
Abstract: It is commonly believed that pre-trained word vectors or pre-trained language models reliably improve task performance, but this is not always the case. Fine-tuning a pre-trained language model or pre-trained word vectors for a downstream task usually requires a sufficient amount of annotated data. In addition, the relevance of the training corpus to the task corpus affects task performance to a large extent. In this paper, we systematically compare the effects of different types of pre-trained embeddings and self-training embeddings on the performance of automatic essay scoring (AES). We also propose an effective solution to these problems: an AES method that uses both pre-trained and self-training word embeddings. We conducted experiments on a publicly available dataset comprising 8 subsets, and the experimental results show the effectiveness of this method.