Application of High Dimensional Feature Grouping Method in Near-Infrared Spectra of Identification of Tobacco Growing Areas

Cheng Zhu, Huili Gong, Zhongren Li, Chunxia Yu
DOI: https://doi.org/10.1109/icisce.2016.58
2016-01-01
Abstract: To increase classification accuracy, this paper presents a novel feature grouping method based on random forest variable importance measures. We applied the method to the classification of tobacco growing areas and compared it with other methods. The results showed that the proposed method efficiently obtained the optimal feature subset and can be used to identify the growing areas of tobacco. The method divides all features into groups according to their importance scores, as measured by random forest variable importance. The optimal feature subset is built from consecutive groups of important features, while groups of irrelevant features are eliminated, which reduces the difficulty of feature selection. The experimental results demonstrated that the proposed method successfully eliminated the irrelevant features and obtained the optimal feature subset, leading to a significant improvement in classification accuracy.
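The grouping idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that importance scores are already available (for instance from a random forest, such as the `feature_importances_` attribute of scikit-learn's `RandomForestClassifier`), that features are ranked and split into roughly equal-sized groups, and that candidate subsets are the cumulative unions of the top-ranked groups. The function names, the equal-size split, and the `evaluate` callback are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from typing import Callable, List, Sequence, Tuple


def group_features_by_importance(scores: Sequence[float], n_groups: int) -> List[List[int]]:
    """Rank feature indices by descending importance and split them into groups.

    Assumption: groups are roughly equal in size; the abstract does not say
    how group boundaries are chosen.
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    size, extra = divmod(len(ranked), n_groups)
    groups: List[List[int]] = []
    start = 0
    for g in range(n_groups):
        end = start + size + (1 if g < extra else 0)
        if end > start:  # skip empty groups when n_groups exceeds the feature count
            groups.append(ranked[start:end])
        start = end
    return groups


def select_subset(
    groups: Sequence[Sequence[int]],
    evaluate: Callable[[List[int]], float],
) -> Tuple[List[int], float]:
    """Score cumulative unions of the top-ranked groups and keep the best one.

    `evaluate` is a hypothetical callback, e.g. cross-validated accuracy of a
    classifier trained on the given feature indices (higher is better).
    """
    best_subset: List[int] = []
    best_score = float("-inf")
    current: List[int] = []
    for group in groups:
        current = current + list(group)
        score = evaluate(current)
        if score > best_score:
            best_subset, best_score = list(current), score
    return best_subset, best_score


# Toy usage with made-up scores (not data from the paper).
toy_scores = [0.02, 0.30, 0.01, 0.25, 0.05, 0.37]
toy_groups = group_features_by_importance(toy_scores, n_groups=3)
```

In practice the `evaluate` callback would wrap whatever classifier and validation scheme is being used. Low-ranked groups that never improve the score are simply left out of the returned subset.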