Academic Article Classification Algorithm Based on Pre-trained Model and Keyword Extraction

Zekai Zhou, Dongyang Zheng, Zihan Qiu, Ronghua Lin, Zhengyang Wu, Chengzhe Yuan
DOI: https://doi.org/10.1007/978-981-19-4549-6_12
2022-01-01
Abstract: Text classification, which has extensive applications in many fields, assigns a tag to a given piece of text. Academic articles are the most authoritative source of academic information and play an important role in delivering the latest academic information. On social media, these academic articles give rise to a considerable volume of academic news, translated articles, tutorial articles, etc. Classifying these academic articles has therefore become increasingly important. In this paper, we employ a pre-trained model for academic text classification. Further, to identify terminology in academic papers, we design a convolutional layer to capture local dependencies. We also introduce a max-pooling layer that selects the most important elements in the text. Considering that academic articles are usually long, we propose a fine-tuning technique based on keyword extraction for the pre-trained model to obtain global information. We conduct experiments on the Fudan Text Classification Corpus and the SCHOLAT academic news dataset. The experimental results show that the proposed method outperforms the methods commonly used in recent years on both datasets.
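The convolution and max-pooling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no filter widths, activation, or dimensions, so the single filter, the ReLU activation, the "valid" padding, and the toy two-dimensional token vectors (standing in for the pre-trained model's outputs) are all assumptions made for illustration. The function names `conv1d` and `max_over_time` are hypothetical.

```python
from typing import List

Vector = List[float]


def conv1d(tokens: List[Vector], kernel: List[Vector], bias: float = 0.0) -> List[float]:
    """Slide one filter over the token sequence (no padding) and apply ReLU.

    Each output value summarizes a window of len(kernel) adjacent tokens,
    which is how a convolution captures local dependencies such as
    multi-word terms. ReLU is an assumed activation.
    """
    width = len(kernel)
    if len(tokens) < width:
        raise ValueError("sequence is shorter than the filter width")
    outputs = []
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        total = bias
        for token_vec, kernel_vec in zip(window, kernel):
            total += sum(t * k for t, k in zip(token_vec, kernel_vec))
        outputs.append(max(0.0, total))
    return outputs


def max_over_time(feature_map: List[float]) -> float:
    """Keep only the strongest response of the filter across all positions."""
    return max(feature_map)


# Toy token vectors; in practice these would come from the pre-trained model.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# A width-2 filter that responds when dimension 0 is followed by dimension 1.
kernel = [[1.0, 0.0], [0.0, 1.0]]

feature_map = conv1d(tokens, kernel)
pooled = max_over_time(feature_map)
print(feature_map, pooled)
```

In a full model, many such filters would run in parallel and their pooled values would be concatenated into a fixed-length vector for the classifier, regardless of how long the input text is.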