A new feature selection algorithm in text categorization

Wei Zhao, Yafei Wang, Dan Li
DOI: https://doi.org/10.1109/3CA.2010.5533870
2010-01-01
Abstract: A major problem in text classification is the high dimensionality of the feature space. This paper investigates how a genetic algorithm (GA) and the k-means algorithm can help select relevant features in text classification. The proposed method uses the optimization properties of the GA to perform global search, and uses the k-means algorithm in the selection operation to control the scope of the search, ensuring the validity of each gene and the speed of convergence. Our experimental results show that the combination of GA and k-means is quite useful in reducing the high feature dimensionality, and that it improves the accuracy and efficiency of text classification.
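The abstract does not give the details of the algorithm, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that each chromosome is a binary mask over features, that the k-means step clusters the population's fitness scores into two groups, and that only the higher-scoring cluster is allowed to breed. The names `kmeans_1d`, `select_features`, and `toy_fitness`, as well as every parameter value, are hypothetical.

```python
import random


def kmeans_1d(values, k=2, iters=20):
    """Plain 1-D k-means (k >= 2). Returns one label per value and the centroids."""
    lo, hi = min(values), max(values)
    # Spread the initial centroids evenly between the smallest and largest value.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest centroid (ties go to the lower index).
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Move each centroid to the mean of its members; empty clusters stay put.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels, centroids


def select_features(fitness, n_features, pop_size=20, generations=30,
                    mutation_rate=0.05, seed=0):
    """GA over binary feature masks; assumes n_features >= 2.

    Assumption: the k-means step restricts parents to the better fitness cluster.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        scores = [fitness(ch) for ch in population]
        labels, centroids = kmeans_1d(scores)
        good = max(range(len(centroids)), key=lambda c: centroids[c])
        parents = [ch for ch, lab in zip(population, labels) if lab == good]
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            a, b = rng.choice(parents), rng.choice(parents)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


def toy_fitness(mask):
    """Hypothetical stand-in for classifier accuracy: features 0-2 are 'relevant'."""
    relevant = sum(mask[:3])
    irrelevant = sum(mask[3:])
    return relevant - 0.5 * irrelevant


if __name__ == "__main__":
    best_mask = select_features(toy_fitness, n_features=8)
    print(best_mask, toy_fitness(best_mask))
```

In a real text classification setting, `toy_fitness` would be replaced by a measure such as cross-validated accuracy of a classifier trained on the selected terms. The choice of two clusters and of elitism is an illustrative assumption, not something stated in the abstract.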