Identification of main crops based on the univariate feature selection in Subei

WANG Na, LI Qiangzi, DU Xin, ZHANG Yuan, ZHAO Longcai, WANG Hongyan
DOI: https://doi.org/10.11834/jrs.20176373
2017-01-01
Journal of Remote Sensing
Abstract: Timely and accurate crop type identification and Crop Acreage Estimates (CAE) are essential for food security. Remote sensing technology has been successfully applied to crop identification because it supports rapid, large-scale monitoring and can quickly provide accurate agricultural information. However, when identifying crop types, both too few and too many features can lead to low classification accuracy. Multi-source, optimally selected features are therefore crucial to crop classification from remotely sensed images. This paper considered a series of features, including multi-temporal spectra, vegetation indexes, textures, and band differences. Multiple experiments were designed and conducted in Sihong County, Jiangsu Province, China, using Gaofen-1 and Huanjing-1 images to evaluate the influence of different features on identification accuracy and to determine the preferred feature combination that improves classification. The combination of random forest classification and univariate feature selection was expected to have a considerably positive effect on distinguishing and extracting the main crops in remote sensing images.
In this study, crop classification was implemented using random forests and univariate feature selection. The random forest method, which constructs many CART decision trees during each classification process, is one of the most effective classification methods. Univariate feature selection is a statistical testing method that scores each feature individually by the strength of its relationship with the target variable and then removes the low-scoring features. First, the random forest classifier was applied to classify the images using the multi-source features listed above. Second, we analyzed the contributions of different feature types and feature combinations to classification accuracy. Third, features were selected using the univariate feature selection method. Finally, we combined the optimal features with the random forest classifier to classify the image and distinguish the main crop types with high accuracy.
The results showed that the overall classification accuracy based on the optimal feature combination reached 97.07%, with a corresponding Kappa coefficient of 0.96, which indicates that the feature selection method used in this paper contributes substantially to high classification accuracy because it efficiently reduces the feature dimension. The results also showed that crop classification using multi-source features outperformed classification using only spectral features. In addition, the experiment that used spectral and vegetation index (VI) features together achieved the second-highest accuracy among all experiments. The optimal feature combination has 19 features: five spectral features, six vegetation indexes, seven band difference features, and one texture feature, which suggests that vegetation indexes and band differences were more important to crop identification than the other two feature types.
This study demonstrated the following: (1) adding different types of features could improve classification accuracy; (2) too many features would decrease classification accuracy; (3) univariate feature selection was effective for choosing the optimal subset of features. Optimally selected features help reduce the computational load and recover the accuracy lost when features are applied blindly. Therefore, the combination of random forest and univariate feature selection is effective in improving classification accuracy and efficiency.
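The univariate selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical test was used, so the one-way ANOVA F statistic here is an assumption, and the feature names, sample values, and helper functions (`anova_f_score`, `select_top_k`) are hypothetical toy illustrations rather than the authors' data or code. The random forest classification step is not shown.

```python
from collections import defaultdict


def anova_f_score(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels.

    Assumption: the F-test stands in for the (unspecified) statistical
    test used in the paper.
    """
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    n = len(values)
    k = len(groups)
    grand_mean = sum(values) / n

    # Variation of the class means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of the samples around their own class mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )

    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_k(samples, labels, feature_names, k):
    """Score every feature independently and keep the k highest-scoring ones."""
    scores = {}
    for j, name in enumerate(feature_names):
        column = [row[j] for row in samples]
        scores[name] = anova_f_score(column, labels)
    ranked = sorted(feature_names, key=lambda name: scores[name], reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: one vegetation index, one band difference,
# and one uninformative texture value per sample.
feature_names = ["ndvi_t1", "nir_minus_red", "texture_noise"]
samples = [
    [0.80, 0.30, 0.5],
    [0.78, 0.32, 0.1],
    [0.82, 0.28, 0.9],
    [0.40, 0.10, 0.4],
    [0.42, 0.12, 0.2],
    [0.38, 0.08, 0.8],
]
labels = ["rice", "rice", "rice", "maize", "maize", "maize"]

selected, scores = select_top_k(samples, labels, feature_names, k=2)
print(selected)
```

Each feature is scored on its own, which is what makes the selection univariate: the features that separate the classes well receive high scores, and the uninformative one is dropped. In practice, a library such as scikit-learn covers both steps (for example `SelectKBest` with `f_classif`, followed by `RandomForestClassifier`); whether the authors used it is not stated in the abstract.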