Text Classification Method Based On Convolution Neural Network
Lin Li, Linlong Xiao, Nanzhi Wang, Guocai Yang, Jianwu Zhang
DOI: https://doi.org/10.1109/compcomm.2017.8322884
2017-01-01
Abstract: Automatic text classification is a fundamental task in natural language processing that helps users select vital information from massive text resources. To better represent the semantic meaning of a text and to avoid the manual feature extraction that traditional methods require, we use the TF-IDF algorithm to calculate the weight of each word in a text and then weight the word vectors by their TF-IDF values. This produces text vectors with clearer semantic meaning. We then feed the resulting text vector matrix into a convolutional neural network (CNN), which extracts text features automatically. Extensive experiments on two data sets demonstrate that our approach effectively improves classification accuracy, reaching 96.28% and 96.97% on the two data sets, respectively.
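The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which TF-IDF variant is used, so the code assumes term frequency normalized by document length and an unsmoothed `log(N / df)` inverse document frequency. The function names `tfidf_weights` and `weighted_text_matrix`, the toy corpus, and the two-dimensional embeddings are all hypothetical and stand in for pretrained word vectors.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {word: tf-idf} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return weights


def weighted_text_matrix(doc, weights, embeddings):
    """Scale each token's word vector by its TF-IDF weight (one row per token)."""
    return [[weights[word] * x for x in embeddings[word]] for word in doc]


# Toy corpus and toy embeddings (hypothetical values for illustration only).
docs = [["cat", "sat"], ["cat", "ran"]]
embeddings = {"cat": [1.0, 0.0], "sat": [0.0, 2.0], "ran": [2.0, 2.0]}

weights = tfidf_weights(docs)
matrix = weighted_text_matrix(docs[0], weights[0], embeddings)
```

Under this unsmoothed variant, a word that occurs in every document (here `cat`) receives weight zero, so its row in the matrix is all zeros, while rarer words keep a scaled copy of their vector. The resulting token-by-dimension matrix is the kind of input that would then be passed to a CNN.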