Tree-based Machine Learning Models for Enhanced Large-Scale Soil Mn Classification by Integrating Visible-Near Infrared Spectroscopy

Chongchong Qi, Min Zhou, Qiusong Chen, Tao Hu
DOI: https://doi.org/10.1007/s11368-024-03914-7
IF: 3.5361
2024-01-01
Journal of Soils and Sediments
Abstract: Given the growing concern over soil heavy metal contamination, there is an increasing need for affordable and precise soil heavy metal information. Efficient, cost-effective methods for detecting soil manganese (Mn), a heavy metal that is also essential for life processes, are particularly important. This study combines tree-based machine learning (ML) algorithms with visible-near infrared (VNIR) spectroscopy to enable rapid, non-destructive soil Mn prediction at the continental scale, introducing a novel ML framework with significant implications for soil Mn management. Soil spectra were obtained using VNIR spectroscopy and preprocessed using a combination of smoothing and derivative techniques. Three tree-based ML models were constructed for soil Mn prediction: extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and random forest (RF). The spectral bands sensitive to soil Mn were then investigated using the optimal tree-based model. The most suitable preprocessing method differed among the tree-based models. The XGBoost model performed best, with an area under the curve (AUC) of 0.918. The most important bands for the XGBoost model’s soil Mn classification were 1408–1410.5 nm, 2323.5–2325.5 nm, and 2144–2147.5 nm. Prediction from these bands relies mainly on the covariation of Mn with spectrally active substances such as clay minerals, water, and organic compounds. This study demonstrates that the XGBoost model, when combined with appropriate preprocessing methods, is efficient for predicting soil Mn content. The sensitive spectral bands provide critical insights into the correlations between Mn and spectral features.
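As a rough illustration of two steps the abstract mentions (smoothing-plus-derivative preprocessing and evaluation by area under the curve), the sketch below may help. It is not the authors' pipeline: the abstract does not name the specific smoothing or derivative methods, so a centered moving average and a central-difference first derivative are assumed here as stand-ins. The function names, the window size, and the 0.5 nm sampling step are likewise hypothetical choices for illustration, and the tree-based classifier itself is omitted.

```python
from typing import List, Sequence


def moving_average(spectrum: Sequence[float], window: int = 5) -> List[float]:
    """Smooth a spectrum with a centered moving average.

    Assumed stand-in for the paper's unspecified smoothing step.
    The window shrinks at the edges so the output keeps the input length.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(spectrum)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        smoothed.append(sum(spectrum[lo:hi]) / (hi - lo))
    return smoothed


def first_derivative(spectrum: Sequence[float], step_nm: float = 0.5) -> List[float]:
    """Approximate d(reflectance)/d(wavelength) with finite differences.

    Central differences are used in the interior and one-sided differences
    at the two ends. step_nm is an assumed wavelength spacing.
    """
    n = len(spectrum)
    if n < 2:
        raise ValueError("need at least two points to differentiate")
    deriv = []
    for i in range(n):
        lo, hi = max(0, i - 1), min(n - 1, i + 1)
        deriv.append((spectrum[hi] - spectrum[lo]) / ((hi - lo) * step_nm))
    return deriv


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive sample receives a
    higher score than a random negative sample; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy usage: preprocess one made-up spectrum, then score made-up predictions.
toy_spectrum = [0.30, 0.31, 0.33, 0.32, 0.35, 0.38, 0.37]
features = first_derivative(moving_average(toy_spectrum, window=3))
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In a fuller workflow, the preprocessed features would be passed to a tree-based classifier, and the AUC would be computed from its predicted class probabilities on held-out samples.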