Using Ensemble Refinement (ER) Method to Optimize Transfer Set of Near-Infrared Spectra
Zheng Kai-yi, Zhang Wen, Ding Fu-yuan, Zhou Chen-guang, Shi Ji-yong, Yoshinori Marunaka, Zou Xiao-bo
DOI: https://doi.org/10.3964/j.issn.1000-0593(2022)04-1323-06
2022-01-01
Spectroscopy and Spectral Analysis
Abstract: Near-infrared (NIR) spectroscopy has been widely used in the food field because of its low measurement cost, easy operation, and fast analysis. As an indirect analytical method, it requires a calibration model relating spectra to concentrations. However, a model calibrated under one condition may be invalid for spectra measured under another condition. Recalibration is one solution, but recalibrating the model between spectra and concentrations costs considerable time and labor. Calibration transfer instead corrects the spectral deviation, maintaining prediction precision while avoiding the expense of recalibration. In calibration transfer, the spectra used to calibrate the model are called primary spectra (A), while those that are not used for calibration but are predicted with the primary-spectra model are called secondary spectra (B). The procedure is as follows. Samples are selected from the calibration set as the transfer set of primary spectra (A_t), and the secondary spectra of samples with the same concentrations as A_t form the transfer set of secondary spectra (B_t). A transfer matrix is then constructed from A_t and B_t. Next, the corrected secondary spectra (B_new) are obtained by multiplying the validation set of secondary spectra (B_v) by the transfer matrix. Finally, B_new is substituted into the primary-spectra model for prediction. Generating the transfer set is therefore an important step in calibration transfer. Transfer-set samples are commonly selected according to spectral distances rather than validation errors, although transfer errors are important for assessing the performance of calibration transfer. Hence, in this paper, ensemble refinement (ER), based on model population analysis, is proposed to further refine the transfer set generated by the Kennard-Stone (KS) method. Initially, ER generates several subsets of the transfer set and computes the validation error of each subset.
Subsequently, for each sample, the average error of the subsets that include it is computed. Finally, the samples with low average errors are selected as the transfer set for calibration transfer. The corn dataset is used to test this method. The results show that, for calibration transfer methods such as canonical correlation analysis combined with informative components extraction (CCA-ICE), direct standardization (DS), piecewise direct standardization (PDS), and spectral space transformation (SST), ER selects key samples for calibration transfer and reduces the errors significantly compared with the KS method.
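
The ER procedure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how subsets are drawn, how many are used, or which validation error is computed, so the choices here are assumptions: subsets are drawn at random without replacement, the transfer matrix is built by direct standardization (DS) as a least-squares fit, and the validation error is the RMSE of concentrations predicted by a linear primary model. The function names (`ds_transfer_matrix`, `transfer_rmse`, `ensemble_refinement`) and the coefficient vector `beta` are hypothetical and are not taken from the paper.

```python
import numpy as np


def ds_transfer_matrix(A_t, B_t):
    """Direct standardization: least-squares F such that B_t @ F ~= A_t."""
    return np.linalg.pinv(B_t) @ A_t


def transfer_rmse(F, B_v, y_v, beta):
    """RMSE of a linear primary model (y ~= A @ beta) on corrected spectra B_v @ F.

    Assumption: the primary model is a plain coefficient vector `beta`.
    """
    y_pred = (B_v @ F) @ beta
    return float(np.sqrt(np.mean((y_pred - y_v) ** 2)))


def ensemble_refinement(A_t, B_t, B_v, y_v, beta,
                        n_subsets=300, subset_size=None, n_keep=None, seed=0):
    """Rank transfer samples by the mean validation error of the subsets containing them.

    Returns the sorted indices of the n_keep samples with the lowest mean
    error, together with the per-sample mean errors.
    """
    rng = np.random.default_rng(seed)
    n = A_t.shape[0]
    subset_size = subset_size if subset_size is not None else max(2, n // 2)
    n_keep = n_keep if n_keep is not None else subset_size

    err_sum = np.zeros(n)
    counts = np.zeros(n)
    for _ in range(n_subsets):
        # Assumption: random subsets of fixed size, drawn without replacement.
        idx = rng.choice(n, size=subset_size, replace=False)
        F = ds_transfer_matrix(A_t[idx], B_t[idx])
        err = transfer_rmse(F, B_v, y_v, beta)
        # Credit this subset's error to every sample it contains.
        err_sum[idx] += err
        counts[idx] += 1

    # Samples that were never drawn get +inf so they are not selected by accident.
    mean_err = np.full(n, np.inf)
    drawn = counts > 0
    mean_err[drawn] = err_sum[drawn] / counts[drawn]

    keep = np.sort(np.argsort(mean_err)[:n_keep])
    return keep, mean_err
```

In this sketch, a sample whose presence tends to degrade the transfer accumulates a high mean error and is dropped, which is the intuition behind refining a KS-selected transfer set by validation error rather than spectral distance alone. Other transfer methods named in the abstract (PDS, SST, CCA-ICE) could in principle replace `ds_transfer_matrix`, though they are not implemented here.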