A Simple Multiple Linear Regression Model in Near Infrared Spectroscopy for Soluble Solids Content of Pomegranate Arils Based on Stability Competitive Adaptive Re-Weighted Sampling

Zhaoqiong Jiang, Yiping Du, Fangping Cheng, Feiyu Zhang, Wuye Yang, Yinran Xiong
DOI: https://doi.org/10.1177/0967033520982366
2021-01-01
Journal of Near Infrared Spectroscopy
Abstract: The objective of this study was to develop a multiple linear regression (MLR) model, using near infrared (NIR) spectroscopy combined with chemometric techniques, for predicting the soluble solids content (SSC) of pomegranate samples at different storage periods. A total of 135 NIR diffuse reflectance spectra covering the wavelength range 950-1650 nm were acquired from pomegranate arils. Outlier diagnosis based on sampling error profile analysis was conducted to improve the stability of the model, and four outliers were removed. Several pretreatment and variable selection methods were compared using partial least squares (PLS) regression models. The overall results demonstrated that first-derivative (1D) pretreatment was very effective and that stability competitive adaptive re-weighted sampling (SCARS) was a powerful variable selection method for extracting feature variables. Over ten repeats, the overall performance of the 1D-SCARS-PLS regression model was similar to that of the 1D-PLS regression model, so wavelength selection offered no conspicuous advantage in PLS regression. However, 1D-SCARS selected fewer than 9 variables, which was sufficient to establish a simple MLR model. The MLR model for SSC of pomegranate arils based on 1D-SCARS achieved a root-mean-square error of calibration of 0.29% and a root-mean-square error of prediction of 0.31%. This strategy of combining a variable selection method with MLR may have broad prospects in NIR spectroscopy applications because of its simplicity and robustness.
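The workflow summarized in the abstract (first-derivative pretreatment, selection of a handful of wavelengths, an MLR fit, and calibration/prediction errors) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the spectra are synthetic, the 90/41 calibration/prediction split and the 141-point wavelength grid are arbitrary, the derivative is a plain finite difference via `np.gradient`, and `select_by_correlation` is a hypothetical correlation-ranking stand-in that is not SCARS. It does not reproduce the paper's data, method details, or results.

```python
import numpy as np


def first_derivative(spectra, wavelengths):
    """Finite-difference first derivative along the wavelength axis (axis 1)."""
    return np.gradient(spectra, wavelengths, axis=1)


def select_by_correlation(X, y, k):
    """Hypothetical stand-in selector (NOT SCARS): keep the k variables
    with the largest absolute Pearson correlation with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    corr = np.abs(Xc.T @ yc) / np.where(denom == 0, np.inf, denom)
    return np.sort(np.argsort(corr)[-k:])


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply the fitted intercept and slopes to new rows of X."""
    return coef[0] + X @ coef[1:]


def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic data only: 131 spectra (135 minus 4 outliers, as in the abstract)
# on an assumed 141-point grid spanning 950-1650 nm.
rng = np.random.default_rng(0)
wavelengths = np.linspace(950.0, 1650.0, 141)
ssc = rng.uniform(12.0, 18.0, size=131)                # made-up SSC values (%)
band = np.exp(-((wavelengths - 1200.0) / 40.0) ** 2)   # made-up absorption band
offset = rng.normal(0.0, 0.05, size=(131, 1))          # per-sample baseline shift
noise = rng.normal(0.0, 1e-4, size=(131, 141))
spectra = 0.01 * ssc[:, None] * band + offset + noise

X = first_derivative(spectra, wavelengths)  # removes the constant baseline shift

# Assumed split: first 90 samples for calibration, remaining 41 for prediction.
X_cal, y_cal = X[:90], ssc[:90]
X_pred, y_pred_ref = X[90:], ssc[90:]

selected = select_by_correlation(X_cal, y_cal, k=8)  # "fewer than 9" variables
coef = fit_mlr(X_cal[:, selected], y_cal)

rmsec = rmse(y_cal, predict_mlr(coef, X_cal[:, selected]))
rmsep = rmse(y_pred_ref, predict_mlr(coef, X_pred[:, selected]))
print("selected wavelengths (nm):", wavelengths[selected])
print("RMSEC:", rmsec, "RMSEP:", rmsep)
```

Variables are selected on the calibration set only, so the prediction-set error is not biased by the selection step. With so few variables, the fitted model is just an intercept plus one coefficient per selected wavelength, which is the simplicity the abstract attributes to the MLR approach.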