A heuristic feature selection approach for text categorization by using chaos optimization and genetic algorithm

Hao Chen, Wen Jiang, Canbing Li, Rui Li
DOI: https://doi.org/10.1155/2013/524017
IF: 1.43
2013-01-01
Mathematical Problems in Engineering
Abstract: With the arrival of the Big Data era and the rapid growth of textual data, text classification has become one of the key techniques for handling and organizing text data. Feature selection is the most important step in automatic text categorization. To choose a subset of the available features by eliminating those that are unnecessary for the classification task, a novel text categorization algorithm called chaos genetic feature selection optimization is proposed. The proposed algorithm selects the optimal subsets in both empirical and theoretical work in machine learning and presents a general framework for text categorization. Experimental results show that the proposed algorithm effectively simplifies the feature selection process and obtains higher classification accuracy with a smaller feature set.
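The abstract does not give the algorithm's details, so the following is only a generic illustration of how chaos optimization and a genetic algorithm can be combined for feature selection, not the authors' method. It assumes a logistic map to seed the initial population of 0/1 feature masks, then tournament selection, one-point crossover, bit-flip mutation, and elitism. All names (`logistic_sequence`, `chaotic_population`, `toy_fitness`, `select_features`), the parameter values, and the fitness function are hypothetical; in particular, `toy_fitness` is a stand-in for the classification accuracy a real system would measure.

```python
import random


def logistic_sequence(x0, n, r=4.0):
    """Return n values of the logistic map x <- r*x*(1-x), a standard chaotic sequence."""
    xs, x = [], x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs


def chaotic_population(pop_size, n_features, x0=0.37):
    """Build initial 0/1 feature masks by thresholding one chaotic sequence at 0.5."""
    seq = logistic_sequence(x0, pop_size * n_features)
    return [
        [1 if seq[i * n_features + j] > 0.5 else 0 for j in range(n_features)]
        for i in range(pop_size)
    ]


def toy_fitness(mask, relevance, penalty=0.05):
    """Hypothetical stand-in for classifier accuracy:
    reward selected features by their relevance score, penalize subset size."""
    gain = sum(r for r, bit in zip(relevance, mask) if bit)
    return gain - penalty * sum(mask)


def select_features(relevance, pop_size=20, generations=40, p_mut=0.02, seed=0):
    """Evolve feature masks with a simple genetic algorithm (assumed operators)."""
    rng = random.Random(seed)
    n = len(relevance)

    def fit(mask):
        return toy_fitness(mask, relevance)

    pop = chaotic_population(pop_size, n)
    for _ in range(generations):
        # Elitism: carry the current best mask over unchanged.
        children = [max(pop, key=fit)[:]]
        while len(children) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = max(rng.sample(pop, 3), key=fit)
            p2 = max(rng.sample(pop, 3), key=fit)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < p_mut else b for b in child]
            children.append(child)
        pop = children
    return max(pop, key=fit)


if __name__ == "__main__":
    # Made-up relevance scores for ten features, for illustration only.
    scores = [0.9, 0.01, 0.8, 0.02, 0.7, 0.0, 0.03, 0.6, 0.01, 0.02]
    mask = select_features(scores)
    print("selected feature indices:", [i for i, b in enumerate(mask) if b])
```

In a text categorization setting, each bit would correspond to a term in the vocabulary, and the fitness would typically come from training and evaluating a classifier on the selected terms. The size penalty is one simple way to favor smaller feature sets.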