Application of High-Dimensional Feature Selection in Near-Infrared Spectroscopy of Cigarettes' Qualitative Evaluation

Qin Yuhua, Ding Xiangqian, Gong Huili
DOI: https://doi.org/10.1080/00387010.2012.746373
2013-01-01
Spectroscopy Letters
Abstract: To increase classification accuracy, this paper presents a new feature selection method, RFFIM-PCA, which combines the random forest feature importance measure (RFFIM) with principal component analysis (PCA) to analyze the near-infrared (NIR) spectra of tobacco. We applied the method to classifying cigarettes by qualitative evaluation and compared it with other methods. The feature selection step filters out noise, while PCA eliminates redundant features and reduces dimensionality. The experimental results showed that RFFIM-PCA discriminates high-dimensional data effectively: it removes noise and redundant features, improves both feature selection and classification accuracy, and can be used to identify cigarette quality.
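The two-stage idea in the abstract (rank features by random forest importance, then apply PCA to the retained features) can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes scikit-learn, uses impurity-based importances as a stand-in for RFFIM, uses synthetic data in place of NIR spectra, and the function names, `n_keep`, `n_components`, and the final random forest classifier are all hypothetical choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def rffim_pca_fit(X_train, y_train, n_keep=50, n_components=5, seed=0):
    """Fit the two stages on training data only (hypothetical helper)."""
    # Stage 1: rank features (wavelengths) by random forest importance.
    # Assumption: impurity-based importances stand in for RFFIM.
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X_train, y_train)
    keep = np.argsort(forest.feature_importances_)[::-1][:n_keep]

    # Stage 2: PCA on the retained features to remove redundancy.
    pca = PCA(n_components=n_components, random_state=seed)
    pca.fit(X_train[:, keep])
    return keep, pca


def rffim_pca_transform(X, keep, pca):
    """Apply the fitted feature subset and PCA projection."""
    return pca.transform(X[:, keep])


# Synthetic stand-in for spectra: 200 samples, 500 "wavelengths", 3 classes.
X, y = make_classification(
    n_samples=200, n_features=500, n_informative=20, n_redundant=30,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

keep, pca = rffim_pca_fit(X_train, y_train)
Z_train = rffim_pca_transform(X_train, keep, pca)
Z_test = rffim_pca_transform(X_test, keep, pca)

# Assumption: the abstract does not name the final classifier;
# a random forest is used here purely for illustration.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(Z_train, y_train)
accuracy = clf.score(Z_test, y_test)
print(f"kept {len(keep)} features, {Z_train.shape[1]} components, "
      f"test accuracy {accuracy:.2f}")
```

Both stages are fitted on the training split only, so the held-out accuracy is not inflated by information from the test samples.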