Machine Learning Modeling and Prediction of Peanut Protein Content Based on Spectral Images and Stoichiometry

Man Zhou, Li Wang, Hejun Wu, Qingye Li, Meiliang Li, Zhiqing Zhang, Yongpeng Zhao, Zhiwei Lu, Zhiyong Zou
DOI: https://doi.org/10.1016/j.lwt.2022.114015
2022-01-01
Abstract: For rapid, nondestructive detection of peanut protein content, an experimental method combining hyperspectral imaging and spectrophotometry was proposed. To address data redundancy and noise, ten algorithms were compared for feature extraction; the results showed that the optimal characteristic bands for protein content lay between 400 and 550 nm. Based on these results, the median filtering (MF) algorithm was used to pre-process the original spectral data, the XGBoost algorithm was used to extract the top 30 feature bands, and the Ridge algorithm was used to construct the protein content prediction model. The reference physicochemical protein content data were measured by spectrophotometry. The optimal model was MF-XGBoost-Ridge, with the hyperparameter alpha tuned by Optuna; it achieved RMSE = 0.009 and a correlation coefficient R = 0.886, with a fitting time of only 0.02 s. Compared with traditional machine learning models, the proposed model achieved higher prediction accuracy and a shorter fitting time.
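The three-stage structure described in the abstract (median filtering, band selection, Ridge regression, then RMSE and R on held-out samples) can be illustrated with a minimal sketch. This is not the authors' implementation. It uses only the Python standard library and synthetic data, and it makes several assumptions that are not in the paper: the band-selection step ranks bands by absolute Pearson correlation as a plain stand-in for XGBoost feature importance, alpha is a hand-picked constant rather than an Optuna-tuned value, and the window size, sample counts, band counts, and all function names are hypothetical.

```python
import random
from statistics import mean, median

def median_filter(spectrum, window=3):
    """Sliding-window median along the band axis; edges are padded by repetition."""
    half = window // 2
    padded = [spectrum[0]] * half + list(spectrum) + [spectrum[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(spectrum))]

def pearson_r(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def rmse(y_true, y_pred):
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]) ** 0.5

def top_k_bands(X, y, k):
    """Stand-in for XGBoost importance: rank bands by |Pearson r| with the target."""
    n_bands = len(X[0])
    scores = [abs(pearson_r([row[j] for row in X], y)) for j in range(n_bands)]
    return sorted(range(n_bands), key=lambda j: scores[j], reverse=True)[:k]

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (M[i][n] - sum(M[i][j] * w[j] for j in range(i + 1, n))) / M[i][i]
    return w

def ridge_fit(X, y, alpha):
    """Closed-form Ridge on centered data: (X^T X + alpha I) w = X^T y."""
    p = len(X[0])
    x_mean = [mean([row[j] for row in X]) for j in range(p)]
    y_mean = mean(y)
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    A = [[sum(r[i] * r[j] for r in Xc) + (alpha if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(Xc, yc)) for i in range(p)]
    w = solve(A, b)
    intercept = y_mean - sum(wi * mi for wi, mi in zip(w, x_mean))
    return w, intercept

def ridge_predict(X, w, intercept):
    return [intercept + sum(wi * xi for wi, xi in zip(w, row)) for row in X]

# Synthetic spectra (assumption): 40 bands, of which bands 3-5 track the target.
random.seed(0)
X_raw, y = [], []
for _ in range(60):
    protein = random.uniform(0.20, 0.30)
    spectrum = [random.gauss(0.5, 0.05) for _ in range(40)]
    for j in (3, 4, 5):
        spectrum[j] = 2.0 * protein + random.gauss(0, 0.002)
    X_raw.append(spectrum)
    y.append(protein)

X = [median_filter(s) for s in X_raw]           # stage 1: pre-processing
X_train, y_train, X_test, y_test = X[:45], y[:45], X[45:], y[45:]
bands = top_k_bands(X_train, y_train, k=3)      # stage 2: band selection
pick = lambda rows: [[row[j] for j in bands] for row in rows]
w, b0 = ridge_fit(pick(X_train), y_train, alpha=1e-3)   # stage 3: Ridge
y_pred = ridge_predict(pick(X_test), w, b0)
test_rmse, test_r = rmse(y_test, y_pred), pearson_r(y_test, y_pred)
print("selected bands:", sorted(bands))
print("RMSE:", test_rmse, "R:", test_r)
```

In practice the same stages would more likely be built from existing libraries, for example `scipy.signal.medfilt` for the filter, the `feature_importances_` attribute of `xgboost.XGBRegressor` for ranking bands, `sklearn.linear_model.Ridge` for the regressor, and an Optuna study to search over `alpha`. The numbers printed here come from synthetic data and say nothing about the results reported in the abstract.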