Methods for performing dimensionality reduction in hyperspectral image classification
Jun-Li Xu, Carlos Esquerre, Da-Wen Sun
DOI: https://doi.org/10.1177/0967033518756175
2018-02-01
Journal of Near Infrared Spectroscopy
Abstract: This paper provides several useful strategies for performing dimensionality reduction on hyperspectral imaging data, with detailed command line scripts in the Matlab computing language provided as supplementary data. Because of the vast number of dimensionality reduction methods available, the paper focuses on approaches commonly adopted in hyperspectral imaging. The transformation-based methods covered are principal component analysis and linear discriminant analysis. The band selection methods comprise partial least squares regression combined with variable importance in projection scores, selectivity ratio, and significance multivariate correlation; Monte Carlo sampling-based methods, including enhanced Monte Carlo variable selection and competitive adaptive reweighted sampling; model population analysis-based methods from libPLS, including uninformative variable elimination, random frog, and PHADIA; Matlab built-in functions for feature selection, including Relieff, stepwise regression, and sequential feature selection; and a selection method guided by a genetic algorithm. The example data, included in the supplementary material and also available for download, are used to simplify decision tree models that differentiate white stripe pixels from red muscle pixels on salmon fillets, since classification is one of the main application domains of hyperspectral imaging. Many original codes and functions were developed for this work, such as fast multiple scattering correction preprocessing, outlier detection, optimal cutoff value determination, and identification and correction of spikes and dead spectra in hyperspectral images. More importantly, a further selection function based on the variance inflation factor is proposed to diagnose and alleviate the collinearity problem, because collinearity and multicollinearity are always expected to be severe in spectral data. A step-by-step procedure is provided so that these strategies can be easily adapted to individual cases.
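The idea of a further selection step based on the variance inflation factor (VIF) can be illustrated with a minimal sketch. This is not the authors' Matlab function; it is a plain Python illustration under stated assumptions: the VIF of each band is taken as the corresponding diagonal element of the inverse correlation matrix, candidate bands are visited in a caller-supplied priority order, and a band is kept only if every VIF in the kept set stays below a threshold of 10 (a common rule of thumb, not a value taken from the paper). The names `vif`, `select_by_vif`, and the toy matrix `X` are hypothetical.

```python
import math

def correlation_matrix(X):
    """Pearson correlation matrix of the columns of X (a list of rows).
    Assumes no column is constant (a constant column has zero norm)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in X]
    norms = [math.sqrt(sum(r[j] ** 2 for r in centered)) for j in range(p)]
    return [[sum(r[i] * r[j] for r in centered) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]

def invert(A):
    """Gauss-Jordan inverse with partial pivoting; returns None if singular."""
    p = len(A)
    M = [list(A[i]) + [float(i == j) for j in range(p)] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(p):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[p:] for row in M]

def vif(X):
    """VIF per column: diagonal of the inverse correlation matrix.
    Exactly collinear (singular) input is reported as infinite VIF."""
    inv = invert(correlation_matrix(X))
    if inv is None:
        return [math.inf] * len(X[0])
    # Non-positive values can only come from rounding; treat them as infinite.
    return [inv[j][j] if inv[j][j] > 0 else math.inf for j in range(len(inv))]

def select_by_vif(X, candidates, threshold=10.0):
    """Greedy pass over `candidates` (band indices in priority order):
    keep a band only if all VIFs of the kept set remain below `threshold`."""
    kept = []
    for band in candidates:
        trial = kept + [band]
        sub = [[row[j] for j in trial] for row in X]
        if max(vif(sub)) < threshold:
            kept = trial
    return kept

# Toy data (hypothetical): 6 pixels x 3 bands; band 1 nearly duplicates band 0.
X = [
    [1.0, 1.1,  1.0],
    [2.0, 2.0, -1.0],
    [3.0, 3.1,  1.0],
    [4.0, 3.9, -1.0],
    [5.0, 5.0,  1.0],
    [6.0, 6.1, -1.0],
]
selected = select_by_vif(X, [0, 1, 2])
print(selected)  # [0, 2]
```

In this sketch the visiting order decides which of two collinear bands survives, so in practice the candidates would typically be ranked first, for example by one of the band selection scores listed in the abstract.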
spectroscopy; chemistry, applied