Ensemble Partial Least Squares Algorithm Based on Variable Clustering for Quantitative Infrared Spectrometric Analysis
Bi Yi-Ming, Chu Guo-Hai, Wu Ji-Zhong, Yuan Kai-Long, Wu Jian, Liao Fu, Xia Jun, Zhang Guang-Xin, Zhou Guo-Jun
DOI: https://doi.org/10.1016/s1872-2040(15)60842-8
2015-01-01
Abstract: Because it overcomes both the dimensionality and the collinearity problems of spectral data, partial least squares (PLS) is increasingly used for quantitative spectrometric analysis, particularly for near-infrared, mid-infrared and Raman spectra. In this study, an improved PLS algorithm is proposed for efficient information extraction and noise reduction. The spectral variables are clustered into several subsets, and a sub-model is built for each subset. The sub-models are then re-weighted and integrated into the final model. Experimental results on two near-infrared datasets (octane number prediction in gasoline and nicotine prediction in tobacco leaves) show that this method outperforms the conventional PLS algorithm, reducing the root mean square error of prediction (RMSEP) by 32% and 22%, respectively.
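The cluster, sub-model, re-weight and integrate pipeline described in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not specify the clustering method, the number of PLS components or the weighting rule, so everything below is an assumption made for illustration: contiguous equal-width blocks of variables stand in for the clustering step, each sub-model is a one-component PLS1 regression, and sub-models are weighted by the inverse of their squared RMSE on a validation set. All function names are hypothetical and this is not the authors' implementation.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def take_columns(rows, cols):
    """Select the given variable indices from every sample."""
    return [[r[j] for j in cols] for r in rows]

def fit_pls1_one_component(X, y):
    """Fit a one-component PLS1 model on mean-centered data; return a predictor."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximal covariance with y, i.e. X^T y normalized.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    w = [v / norm for v in w]
    # Scores t = X w, then an ordinary least squares fit of y on t.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    tt = sum(v * v for v in t)
    q = sum(t[i] * yc[i] for i in range(n)) / tt if tt > 0 else 0.0

    def predict(rows):
        return [y_mean + q * sum((r[j] - x_mean[j]) * w[j] for j in range(p))
                for r in rows]
    return predict

def fit_ensemble(X_train, y_train, X_val, y_val, n_clusters):
    """Build one sub-model per variable subset and combine them with weights.

    Assumes 1 <= n_clusters <= number of variables.
    """
    p = len(X_train[0])
    # ASSUMPTION: contiguous blocks replace the paper's variable clustering.
    bounds = [k * p // n_clusters for k in range(n_clusters + 1)]
    groups = [list(range(bounds[k], bounds[k + 1])) for k in range(n_clusters)]
    models, weights = [], []
    for cols in groups:
        model = fit_pls1_one_component(take_columns(X_train, cols), y_train)
        err = rmse(y_val, model(take_columns(X_val, cols)))
        models.append((cols, model))
        # ASSUMPTION: inverse squared validation RMSE as the re-weighting rule.
        weights.append(1.0 / (err ** 2 + 1e-12))
    total = sum(weights)
    weights = [v / total for v in weights]

    def predict(rows):
        combined = [0.0] * len(rows)
        for (cols, model), weight in zip(models, weights):
            sub_pred = model(take_columns(rows, cols))
            combined = [c + weight * s for c, s in zip(combined, sub_pred)]
        return combined
    return predict, weights
```

Calling `fit_ensemble(X_train, y_train, X_val, y_val, n_clusters)` returns a predictor and the normalized sub-model weights; the RMSEP is then `rmse(y_test, predict(X_test))` on a held-out prediction set. Under this weighting rule, subsets of variables that carry mostly noise receive small weights, which is one plausible way to obtain the noise reduction the abstract mentions.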