Chinese text multi-classification based on Sentences Order Prediction improved Bert model

Guanping Fu, Jianwei Sun
DOI: https://doi.org/10.1088/1742-6596/2031/1/012054
2021-09-01
Journal of Physics: Conference Series
Abstract: The NSP (Next Sentence Prediction) mechanism in Bert introduces strong noise into the model. To improve the classification performance of Bert on text classification, this paper replaces the NSP mechanism with an SOP (Sentence Order Prediction) mechanism and applies the resulting model to multi-class classification of Chinese news texts. First, randomly ordered pairs of adjacent sentences are used for segment embedding. Then the Transformer structure of the Bert model encodes the Chinese text, and the final CLS vector is taken as the semantic vector of the text. Finally, the semantic vectors are connected to a multi-class classifier. In ablation experiments, the improved SOP-Bert model obtained the highest F1 value of 96.69. The results show that this model is more effective than the original Bert model on multi-class text classification.
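The abstract does not spell out how the sentence pairs are built, so the following is only a minimal sketch of one plausible reading of the SOP input step. The function names `make_sop_pairs` and `encode_pair`, the 50% swap probability, and the character-level tokenization are assumptions made for illustration and are not taken from the paper.

```python
import random


def make_sop_pairs(sentences, rng):
    """Build (first, second, label) triples from adjacent sentences.

    label 1 means the original order was kept, label 0 means it was swapped.
    The 0.5 swap probability is an assumption, not a value from the paper.
    """
    pairs = []
    for a, b in zip(sentences, sentences[1:]):
        if rng.random() < 0.5:
            pairs.append((a, b, 1))
        else:
            pairs.append((b, a, 0))
    return pairs


def encode_pair(first, second):
    """Return character-level tokens and segment ids for one sentence pair.

    Segment id 0 covers [CLS], the first sentence, and its [SEP];
    segment id 1 covers the second sentence and the final [SEP].
    """
    tokens = ["[CLS]"] + list(first) + ["[SEP]"] + list(second) + ["[SEP]"]
    segment_ids = [0] * (len(first) + 2) + [1] * (len(second) + 1)
    return tokens, segment_ids


if __name__ == "__main__":
    sentences = ["今天天气很好", "我们去公园", "公园里人很多"]
    for first, second, label in make_sop_pairs(sentences, random.Random(0)):
        tokens, segment_ids = encode_pair(first, second)
        print(label, tokens, segment_ids)
```

In a full pipeline, the token and segment ids would be passed to the Bert encoder, and the output vector at the `[CLS]` position would feed the classifier described in the abstract.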