Feature Selection Based on Information Gain and GA

REN Jiang-Tao,SUN Jing-Hao,HUANG Huan-Yu,YIN Jian
2006-01-01
Computer Science
Abstract: Feature selection is an important problem in pattern recognition and data mining. For high-dimensional data, feature selection can not only improve the accuracy and efficiency of classification but also reveal informative feature subsets. This paper proposes a new feature selection method that combines the filter and wrapper models. The method first filters features by partitioning them according to information gain, which yields a compact, representative feature subset; a genetic algorithm then searches this subset for a near-optimal feature subset, with each candidate evaluated by the classification error of a perceptron model. Experiments show that the proposed algorithm finds feature subsets with good linear separability, yielding low-dimensional data and good classification accuracy.
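The filter-then-wrapper pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the information-gain partitioning scheme, so a plain top-k ranking by information gain stands in for it here. The function names, the toy data, the per-feature penalty of 0.01, the GA settings, and the use of training error rather than held-out error are all assumptions made for illustration.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(column, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature column."""
    n = len(labels)
    cond = 0.0
    for value in set(column):
        subset = [y for x, y in zip(column, labels) if x == value]
        cond += len(subset) / n * entropy(subset)
    return entropy(labels) - cond

def perceptron_error(X, y, features, epochs=20):
    """Training error rate of a perceptron using only `features`; labels in {-1, +1}."""
    if not features:
        return 1.0  # assumption: an empty subset counts as fully wrong
    w, b = [0.0] * len(features), 0.0
    score = lambda row: b + sum(wi * row[f] for wi, f in zip(w, features))
    for _ in range(epochs):
        for row, target in zip(X, y):
            if target * score(row) <= 0:  # misclassified: apply perceptron update
                w = [wi + target * row[f] for wi, f in zip(w, features)]
                b += target
    return sum(target * score(row) <= 0 for row, target in zip(X, y)) / len(y)

def ga_select(X, y, candidates, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Search bit masks over `candidates` with a simple elitist genetic algorithm."""
    rng = random.Random(seed)
    k = len(candidates)

    def decode(mask):
        return [c for c, bit in zip(candidates, mask) if bit]

    def fitness(mask):  # lower is better: error plus an assumed size penalty
        feats = decode(mask)
        return perceptron_error(X, y, feats) + 0.01 * len(feats)

    pop = [[rng.randint(0, 1) for _ in range(k)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[:pop_size // 2]  # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, c = rng.sample(parents, 2)
            cut = rng.randint(1, k - 1) if k > 1 else 0  # one-point crossover
            child = a[:cut] + c[cut:]
            children.append([bit ^ 1 if rng.random() < p_mut else bit for bit in child])
        pop = parents + children
    return decode(min(pop, key=fitness))

# Toy data (hypothetical): feature 0 determines the label, features 1 and 2 do not.
y = [-1, -1, -1, -1, 1, 1, 1, 1]
X = [[0, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 1],
     [1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]]

# Filter stage: rank features by information gain and keep the top two.
ig = [information_gain([row[j] for row in X], y) for j in range(3)]
candidates = sorted(range(3), key=lambda j: -ig[j])[:2]

# Wrapper stage: GA search over the filtered candidates, scored by the perceptron.
selected = ga_select(X, y, candidates)
```

On this toy data the filter stage gives feature 0 an information gain of 1 bit and the other two features 0 bits, so the GA only has to search over two candidates. With real high-dimensional data, the filter stage is what keeps the GA's search space small enough to be practical.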