A stacking ensemble model for predicting soil organic carbon content based on visible and near-infrared spectroscopy

Ke Tang, Xing Zhao, Zong Xu, Huojiao Sun
DOI: https://doi.org/10.1016/j.infrared.2024.105404
IF: 2.997
2024-06-17
Infrared Physics & Technology
Abstract: The content of soil organic carbon (SOC) plays an important role in maintaining ecosystem functions, protecting soil biodiversity, and understanding carbon cycling processes. Combining visible and near-infrared spectroscopy (VIS–NIRS) with machine learning enables rapid prediction of SOC content. However, it remains unclear how best to integrate the characteristics of different machine learning models to improve the performance of SOC prediction models. In this study, a new model for predicting SOC content from VIS–NIRS data, based on stacking ensemble learning, was proposed. The prediction performance of six models, namely Support Vector Regression (SVR), Extreme Gradient Boosting (XGBoost), Random Forest (RF), Light Gradient Boosting Machine (LightGBM), Partial Least Squares (PLS), and Extreme Learning Machine (ELM), was compared under different spectral preprocessing methods. The results indicated that the SVR, XGBoost, and LightGBM models provide better prediction performance after first-order derivative preprocessing. Various combinations of base models were then compared as the first layer of a stacking ensemble model; the combination of XGBoost, LightGBM, and SVR and the combination of SVR, ELM, and LightGBM both achieved the best performance. The coefficient of determination (R²) of the stacking ensemble model on the test set reaches 0.84, an improvement in accuracy over traditional single models. The stability of the stacking ensemble model was verified by applying it to datasets of different sizes, suggesting that it can replace traditional machine learning models in predicting SOC content.
Optics; Physics, Applied; Instruments & Instrumentation