Long Text Classification Based on BERT

Ding Weijie, Li Yunyi, Zhang Jing, Shen Xuchen
DOI: https://doi.org/10.1109/itnec52019.2021.9587007
2021-10-15
Abstract: Existing text classification algorithms generally restrict the length of the input text and classify long texts poorly. To address this problem, we propose a BERT-based long text classification method. First, we slice the long text into clauses and encode each clause with BERT to obtain local semantic information. Second, we use a BiLSTM to fuse the local semantic information and apply an attention mechanism to increase the weight of important clauses in the long text, yielding global semantic information. Finally, the global semantic information is fed to a softmax layer for classification. Experimental results show that the proposed method achieves higher accuracy than commonly used models.
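The slicing and attention steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the abstract does not give the slice length, whether slices overlap, or the form of the attention scoring, so the `max_len` and `stride` parameters, the dot-product scoring against a `query` vector, and all function names here are assumptions made for illustration. The BERT encoder and the BiLSTM are omitted; the chunk vectors passed to `attention_pool` stand in for their outputs.

```python
import math


def slice_text(tokens, max_len, stride):
    """Split a token list into windows of at most max_len tokens.

    stride == max_len gives non-overlapping slices; a smaller stride
    gives overlapping ones (overlap is an assumption, not from the paper).
    """
    if max_len <= 0 or stride <= 0:
        raise ValueError("max_len and stride must be positive")
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + max_len])
        # Stop once the current window reaches the end of the text.
        if start + max_len >= len(tokens):
            break
        start += stride
    return chunks


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(chunk_vectors, query):
    """Combine per-chunk vectors into one vector.

    Each chunk is scored by its dot product with a query vector
    (a hypothetical learned parameter), the scores are normalized with
    softmax, and the chunk vectors are summed with those weights.
    """
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in chunk_vectors]
    weights = softmax(scores)
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, chunk_vectors))
        for d in range(len(query))
    ]
    return pooled, weights
```

In a full model along the lines of the abstract, each slice would be encoded by BERT, the sequence of slice encodings would pass through a BiLSTM, and the pooled vector would feed a linear layer followed by softmax over the class labels.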