An Efficient Character-Level and Word-Level Feature Fusion Method for Chinese Text Classification

Jin Wenzhen,Zhu Hong,Yang Guocai
DOI: https://doi.org/10.1088/1742-6596/1229/1/012057
2019-01-01
Journal of Physics Conference Series
Abstract:In order to extract semantic feature information between texts more efficiently and reduce the effect of text representation on classification results, we propose a features fusion model C_BiGRU_ATT based on deep learning. The core task of our model is to extract the context information and local information of the text using Convolutional Neural Network(CNN) and Attention-based Bidirectional Gated Recurrent Unit(BiGRU) at character-level and word-level. Our experimental results show that the classification accuracies of C_BiGRU_ATT reach 95.55% and 95.60% on two Chinese datasets THUCNews and WangYi respectively. Meanwhile, compared with the single model based on character-level and word-level for CNN, the classification accuracies of C_BiGRU_ATT is increased by 1.6%, 2.7% on the THUCNews, and is increased by 0.6%, 5.2% on the WangYi. The results show that the proposed model C_BiGRU_ATT can extract text features more effectively.
What problem does this paper attempt to address?