Chinese Text Classification Based on Hybrid Model of CNN and LSTM

Xuewei Li, Hongyun Ning
DOI: https://doi.org/10.1145/3414274.3414493
2020-07-24
Abstract: Text classification is one of the basic tasks of natural language processing. In recent years, deep learning has been widely used for text classification, with the convolutional neural network (CNN) as a representative model. A CNN is limited by the size of its local window and can only extract local features of the text, so for long texts such as news articles it cannot learn long-term dependencies. Recurrent neural networks based on long short-term memory (LSTM), another deep learning model, can learn the long-term dependencies of text. In this paper we therefore combine the advantages of CNN and LSTM and construct an LSTM_CNN hybrid model for Chinese news text classification. We first use an LSTM to learn the long-term dependencies of the text, then apply a shallow convolution structure to further extract semantic features, and finally use a max-pooling operation to select the most important features for classification. The proposed model achieves good results on the News dataset.
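The pipeline described in the abstract (LSTM, then a shallow convolution, then max-pooling, then classification) can be illustrated with a minimal forward-pass sketch in NumPy. This is not the authors' implementation: the abstract gives no layer sizes, so the vocabulary size, embedding and hidden dimensions, kernel width, filter count, and number of classes below are all assumptions, the weights are random and untrained, and the input is assumed to be text already segmented and mapped to integer token ids. The helper names (`lstm_forward`, `conv1d_relu`, `classify`, `init_params`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(x, W, U, b):
    """Run a single-layer LSTM over x of shape (T, d_in).

    W: (d_in, 4*d_h), U: (d_h, 4*d_h), b: (4*d_h,).
    Returns the hidden state at every time step, shape (T, d_h).
    """
    T, d_h = x.shape[0], U.shape[0]
    h, c = np.zeros(d_h), np.zeros(d_h)
    outputs = np.zeros((T, d_h))
    for t in range(T):
        z = x[t] @ W + h @ U + b
        i, f, o, g = np.split(z, 4)          # input, forget, output gates; candidate
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g                    # update the cell state
        h = o * np.tanh(c)                   # expose the gated hidden state
        outputs[t] = h
    return outputs

def conv1d_relu(h, K, b):
    """Valid 1-D convolution over time followed by ReLU.

    h: (T, d_h), K: (k, d_h, n_filters), b: (n_filters,).
    Returns feature maps of shape (T - k + 1, n_filters).
    """
    k, T = K.shape[0], h.shape[0]
    out = np.stack([
        np.tensordot(h[t:t + k], K, axes=([0, 1], [0, 1])) + b
        for t in range(T - k + 1)
    ])
    return np.maximum(out, 0.0)

def classify(token_ids, p):
    """LSTM -> shallow convolution -> max-over-time pooling -> softmax."""
    x = p["embed"][token_ids]                              # (T, d_in)
    h = lstm_forward(x, p["W"], p["U"], p["b_lstm"])       # (T, d_h)
    feats = conv1d_relu(h, p["K"], p["b_conv"])            # (T-k+1, n_filters)
    pooled = feats.max(axis=0)                             # strongest response per filter
    logits = pooled @ p["W_out"] + p["b_out"]
    e = np.exp(logits - logits.max())                      # numerically stable softmax
    return e / e.sum()

def init_params(vocab=5000, d_in=32, d_h=64, k=3, n_filters=48, n_classes=10, seed=0):
    """Random, untrained parameters; every size here is an assumed placeholder."""
    rng = np.random.default_rng(seed)
    s = 0.1
    return {
        "embed": rng.normal(0, s, (vocab, d_in)),
        "W": rng.normal(0, s, (d_in, 4 * d_h)),
        "U": rng.normal(0, s, (d_h, 4 * d_h)),
        "b_lstm": np.zeros(4 * d_h),
        "K": rng.normal(0, s, (k, d_h, n_filters)),
        "b_conv": np.zeros(n_filters),
        "W_out": rng.normal(0, s, (n_filters, n_classes)),
        "b_out": np.zeros(n_classes),
    }

if __name__ == "__main__":
    params = init_params()
    probs = classify(np.array([1, 5, 42, 7, 99, 3]), params)
    print(probs.shape, float(probs.sum()))
```

In this ordering the convolution operates on LSTM hidden states rather than raw embeddings, so each local window already carries context from earlier tokens, and max-over-time pooling reduces a variable-length text to a fixed-size vector for the classifier.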