A Multi-feature Fusion Method with Attention Mechanism for Long Text Classification

Yuqi Liu, Tianning Li, Tian-jian Luo
DOI: https://doi.org/10.1145/3523089.3523093
2022-02-25
Abstract: Long texts often contain much information irrelevant to their subject, which degrades text classification performance. To address this, this paper proposes a multi-feature fusion method with an attention mechanism for long text classification. A long text can be regarded as a hierarchical structure in which words compose sentences and sentences compose paragraphs. First, sentences are encoded, and an attention mechanism aggregates the words into a sentence-level representation according to their different contributions. Then, the sentence-level representations are aggregated into a representation of the whole long text according to the contribution of each sentence. In sentence encoding, based on a global target vector, a convolutional neural network is used to extract the local features of words and the average representation features of words, so as to further enhance the semantic representation of the text. Finally, the important information features of the long text are fused and classified in a linear layer. Experimental results on manually processed THUCNews data show that the model performs well on long text data with a hierarchical structure, with classification accuracy reaching 0.952.
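The two-level aggregation described in the abstract (words into sentences, then sentences into a long-text representation) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain dot-product attention score against a fixed context vector, and it omits the convolutional features, the average word features, the learned projections, and the final linear classifier. The names `attention_pool`, `encode_document`, `word_ctx`, and `sent_ctx` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Attention-weighted average of `vectors`.

    Assumption: each score is the dot product of a vector with `context`,
    which stands in for a learned target vector.
    """
    scores = [sum(a * b for a, b in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def encode_document(doc, word_ctx, sent_ctx):
    """doc is a list of sentences; each sentence is a list of word vectors."""
    # Level 1: pool the words of each sentence into one sentence vector.
    sent_vecs = [attention_pool(sentence, word_ctx)[0] for sentence in doc]
    # Level 2: pool the sentence vectors into one long-text vector.
    return attention_pool(sent_vecs, sent_ctx)


# Toy input: two sentences with 2-dimensional word vectors (made-up values).
doc = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.2, 0.8], [0.9, 0.1]],
]
doc_vec, sent_weights = encode_document(doc, word_ctx=[1.0, 0.0], sent_ctx=[0.0, 1.0])
```

Here `sent_weights` plays the role of the sentence-level contributions: sentences that score higher against the context vector contribute more to `doc_vec`, which is the mechanism by which content irrelevant to the subject can be down-weighted.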
Computer Science