Feature vector optimization method for text classification

Zhengbin Guo, Yangsen Zhang, Yuru Jiang
DOI: https://doi.org/10.3969/j.issn.1001-3695.2017.08.013
2017-01-01
Abstract: The vector space model is commonly used to represent a text as a feature vector. Such a vector can be optimized in two ways: by adjusting its weights or by adjusting its dimensions. This paper proposes a novel feature vector optimization method for text classification. First, it optimizes the features in the text vector by removing synonyms. Second, it introduces a new concept, the contributor factor, to optimize the feature values. Results show that the text classification accuracy of this method is 0.96% higher than that of the Naive Bayesian method. Removing synonyms and adjusting the weights of the feature words therefore optimizes the text vector and improves the accuracy of text classification.
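The two steps named in the abstract can be illustrated with a minimal sketch. The abstract does not define the contributor factor or say how synonyms are detected, so everything here is an assumption made for illustration: the synonym map `SYNONYMS`, the function names `merge_synonyms` and `reweight`, and the per-term multipliers passed as `factors` are hypothetical stand-ins, not the authors' method.

```python
from collections import Counter

# Hypothetical synonym map (assumption): each word maps to one canonical term.
SYNONYMS = {"automobile": "car", "vehicle": "car", "film": "movie"}


def merge_synonyms(tokens, synonyms=SYNONYMS):
    """Collapse synonyms onto a canonical term and count term frequencies.

    This reduces the number of dimensions in the text vector.
    """
    return Counter(synonyms.get(token, token) for token in tokens)


def reweight(term_freq, factors, default=1.0):
    """Scale each term's weight by a per-term multiplier.

    The multipliers are a placeholder for the paper's contributor factor,
    whose actual definition is not given in the abstract.
    """
    return {
        term: count * factors.get(term, default)
        for term, count in term_freq.items()
    }


tokens = ["car", "automobile", "film", "movie", "plot"]
tf = merge_synonyms(tokens)            # 5 tokens collapse to 3 dimensions
weighted = reweight(tf, {"car": 1.5})  # boost one term, leave the rest unchanged
```

The resulting dictionary could then be fed to any classifier that accepts weighted term vectors; how the paper combines the reweighted vector with Naive Bayes is not stated in the abstract.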