Improving Short Text Classification Through Better Feature Space Selection

Meng Wang, Lanfen Lin, Feng Wang
DOI: https://doi.org/10.1109/cis.2013.32
2013-01-01
Abstract: Nowadays people are overwhelmed by a growing volume of short pieces of information from many different applications, especially with the rapid development of mobile systems. One way to alleviate this issue is to classify short texts automatically before they are delivered to users. Several methods have been proposed for classifying short texts, and they largely rely on expanding the short texts into longer ones with external resources to overcome the sparseness problem. In contrast to these studies, we tackle the sparseness problem by selecting a better feature space in which the feature vectors of the short texts are denser, and our method needs no external resources at all. Experimental results on an open dataset show that this method significantly improves short text classification accuracy compared with the baseline, especially when the dimension of the feature space is low.