Abstract: In chemometrics, partial least squares (PLS) regression has become an established tool for spectroscopic analysis. Even so, variable selection is often necessary in molecular spectroscopic analysis to improve the performance of multivariate calibration models. Applying a single variable selection method, however, has limitations. Individual wavelength selection methods may suffer from large computational requirements, while wavelength interval selection methods may retain uninformative or interfering wavelengths and remain affected by collinearity within the selected intervals. Accordingly, this study proposes a novel hybrid variable selection strategy, called memetic algorithm-interval partial least squares coupled with Hilbert-Schmidt independence criterion based variable space iterative optimization (MA-iPLS + HSIC–VSIO). In the first step, a wavelength interval selection method, MA-iPLS, is used to select optimal wavelength intervals. In the second step, a novel individual wavelength selection method, HSIC–VSIO, is employed to further optimize the wavelengths within the selected intervals. The hybrid strategy thus makes full use of both MA-iPLS and HSIC–VSIO. To investigate its performance, MA-iPLS + HSIC–VSIO was tested on two spectroscopic datasets: the surface-enhanced Raman scattering (SERS) spectra of chlorpyrifos standard solutions and the near-infrared (NIR) spectra of diesel fuels. Nine methods, namely PLS, HSIC–VSIO, MA-iPLS, genetic algorithm-interval partial least squares (GA-iPLS), particle swarm optimization-interval partial least squares (PSO-iPLS), interval random frog (iRF), GA-iPLS + HSIC–VSIO, PSO-iPLS + HSIC–VSIO, and iRF + HSIC–VSIO, were also applied to the same datasets for comparison. The results demonstrated the excellent performance of MA-iPLS + HSIC–VSIO for molecular spectroscopic analysis.
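To make the second step more concrete, the following is a minimal sketch of how the Hilbert-Schmidt independence criterion can score individual wavelengths inside intervals that have already been selected. It is not the authors' HSIC–VSIO procedure, whose iterative optimization is not described in the abstract. It only computes the standard biased empirical HSIC statistic, trace(KHLH)/(n-1)^2, with Gaussian kernels. The function names, the median-heuristic bandwidth, the synthetic data, and the candidate index range are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(v, sigma=None):
    """Gaussian (RBF) kernel matrix for a 1-D or 2-D array of samples."""
    v = np.asarray(v, dtype=float)
    if v.ndim == 1:
        v = v[:, None]
    sq = np.sum(v ** 2, axis=1)
    # Pairwise squared distances, clipped at zero against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * v @ v.T, 0.0)
    if sigma is None:
        # Median heuristic for the bandwidth (an assumption of this sketch).
        pos = d2[d2 > 0]
        sigma = np.sqrt(0.5 * np.median(pos)) if pos.size else 1.0
    return np.exp(-d2 / (2.0 * sigma ** 2))


def hsic(x, y):
    """Biased empirical HSIC between two samples of equal length."""
    n = len(x)
    K = gaussian_kernel(x)
    L = gaussian_kernel(y)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2


def rank_wavelengths(X, y, candidate_idx):
    """Rank candidate wavelength columns of X by HSIC with y, highest first."""
    scores = np.array([hsic(X[:, j], y) for j in candidate_idx])
    order = np.argsort(scores)[::-1]
    return [candidate_idx[i] for i in order], scores[order]


# Synthetic stand-in for spectra: 60 samples, 20 "wavelengths".
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
# Hypothetical response that depends only on wavelength 7.
y = 2.0 * X[:, 7] + 0.1 * rng.normal(size=60)

# Hypothetical output of an interval selection step: columns 5..14.
candidate_idx = list(range(5, 15))
ranked, scores = rank_wavelengths(X, y, candidate_idx)
print(ranked[:3], scores[:3])
```

In a real calibration workflow, a ranking like this would be only one ingredient: the retained wavelengths would still need to be validated with a PLS model under cross-validation, which this sketch omits.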

Variable-weighted PLS

Optimal Selection of Secondary Variables Based on GA-PLS Algorithm and Its Application to Soft Sensor Modeling

A Selective Moving Window Partial Least Squares Method and Its Application in Process Modeling

[Spectral Wavelength Selection Based on PLS Projection Analysis].

Subspace Partial Least Squares Model for Multivariate Spectroscopic Calibration

Soft variable selection combining partial least squares and attention mechanism for multivariable calibration

Application of Partial Least Squares Support Vector Machines (PLS-SVM) in Spectroscopy Quantitative Analysis

Optimal Weighting Distance-Based Similarity for Locally Weighted PLS Modeling

Variable Selection in Discriminant Partial Least-Squares Analysis

A novel hybrid variable selection strategy with application to molecular spectroscopic analysis

LPV Model Identification Using Blended Linear Models with Given Weightings

[Near-Infrared Spectral Quantitative Analysis by Combining Classification with Local PLS].

A Boosting-Partial Least Squares Method for Ultraviolet Spectroscopic Analysis of Water Quality

Improvement of PLS model transferability by robust wavelength selection

Variance constrained partial least squares

Simultaneous wavelength selection and outlier detection in multivariate regression of near-infrared spectra

A variable importance criterion for variable selection in near-infrared spectral analysis

Comparison of variable selection methods for PLS-based soft sensor modeling

Uninformative variable elimination for improvement of successive projections algorithm on spectral multivariable selection with different calibration algorithms for the rapid and non-destructive determination of protein content in dried laver

Model selection for partial least squares regression

A Novel Variable Selection Method Based on Binning-Normalized Mutual Information for Multivariate Calibration