A Genetic Algorithm Based Wrapper Feature Selection Method for Classification of Hyper Spectral Data Using Support Vector Maching
ZHUO Li,ZHENG Jing,WANG Fang,LI Xia,AI Bin,QIAN Jun-ping
DOI: https://doi.org/10.3321/j.issn:1000-0585.2008.03.002
2008-01-01
Geographical Research
Abstract:The high-dimensional feature vectors of hyper spectral data often impose a high computational cost as well as the risk of over fitting when classification is performed. Therefore it is necessary to reduce the dimensionality through ways like feature selection. Currently, there are two kinds of feature selection methods: filter methods and wrapper methods. The former kind requires no feedback from classifiers and estimates the classification performance indirectly. The latter kind evaluates the goodness of selected feature subset directly based on the classification accuracy. Many experimental results have proved that the wrapper methods can yield better performance, although they have the disadvantage of high computational cost. In this paper, we present a Genetic Algorithm (GA) based wrapper method for classification of hyper spectral data using Support Vector Machine (SVM), a state-of-art classifier that has found to be success in a variety of areas. The genetic algorithm (GA), which seeks to solve optimization problems using the methods of evolution, specifically survival of the fittest, was used to optimize both the feature subset, i.e. band subset, of hyper spectral data and SVM kernel parameters simultaneously. A special strategy was adopted to reduce computation cost caused by the high-dimensional feature vectors of hyper spectral data when the feature subset part of chromosome was designed. The GA-SVM method was realized using the ENVI/IDL language, and was then tested by applying a HYPERION hyper spectral image. Comparison of the optimized results and the un-optimized results showed that the GA-SVM method could significantly reduce the computation cost while improving the classification accuracy. The number of bands used for classification was reduced from 198 to 13, while the classification accuracy increased from 88.81% to 92.51%. The optimized values of the two SVM kernel parameters were 95.0297 and 0.2021, respectively, which were different from the default values as used in the ENVI software. In conclusion, the proposed wrapper feature selection method GA-SVM can optimize feature subsets and SVM kernel parameters at the same time, therefore can be applied in feature selection of the hyper spectral data.