Improved soil carbon stock spatial prediction in a Mediterranean soil erosion site through robust machine learning techniques
Hassan Mosaid, Ahmed Barakat, Kingsley John, Elhousna Faouzi, Vincent Bustillo, Mohamed El Garnaoui, Brandon Heung
DOI: https://doi.org/10.1007/s10661-024-12294-x
IF: 3.307
2024-01-11
Environmental Monitoring and Assessment
Abstract: Soil serves as a reservoir for organic carbon stock, which indicates soil quality and fertility within the terrestrial ecosystem. Therefore, it is crucial to understand the spatial distribution of soil organic carbon stock (SOCS) and the factors influencing it to achieve sustainable practices and ensure soil health. The present study applied four machine learning (ML) models, namely random forest (RF), k-nearest neighbors (kNN), support vector machine (SVM), and Cubist model tree (Cubist), to improve the prediction of SOCS in the Srou catchment, located in the Upper Oum Er-Rbia watershed, Morocco. From an inventory of 120 sample points, 80% were used for model training, with the remaining 20% set aside for model testing. Boruta's algorithm and a multicollinearity test retained nine controlling factors as input data for predicting SOCS. Spatial distribution maps of SOCS were generated for all models, compared, and validated using statistical metrics. Among the models tested, the RF model exhibited the best performance (R² = 0.76, RMSE = 0.52 Mg C/ha, NRMSE = 0.13, and MAE = 0.34 Mg C/ha), followed closely by the SVM model (R² = 0.68, RMSE = 0.59 Mg C/ha, NRMSE = 0.15, and MAE = 0.34 Mg C/ha) and the Cubist model (R² = 0.64, RMSE = 0.63 Mg C/ha, NRMSE = 0.16, and MAE = 0.43 Mg C/ha), while the kNN model had the lowest performance (R² = 0.31, RMSE = 0.94 Mg C/ha, NRMSE = 0.24, and MAE = 0.63 Mg C/ha). Bulk density, pH, electrical conductivity, and calcium carbonate were the most important factors for spatially predicting SOCS in this semi-arid region. Hence, the methodology used in this study, which relies on ML algorithms, holds potential for modeling and mapping SOCS and other soil properties in comparable contexts elsewhere.
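The 80/20 split and the four validation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, and the abstract does not give the exact definitions used, so R² is assumed here to be the coefficient of determination (1 − SS_res/SS_tot) and NRMSE is assumed to be RMSE divided by the range of the observed values.

```python
import math
import random


def train_test_split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def regression_metrics(observed, predicted):
    """Return R2, RMSE, NRMSE and MAE for paired observed/predicted values.

    Assumptions (not stated in the abstract): R2 is the coefficient of
    determination, and NRMSE is RMSE normalized by the observed range.
    """
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": rmse,
        "NRMSE": rmse / (max(observed) - min(observed)),
        "MAE": sum(abs(r) for r in residuals) / n,
    }


# 120 sample points split 80/20, as described in the abstract.
train_idx, test_idx = train_test_split_indices(120)
print(len(train_idx), len(test_idx))  # prints: 96 24
```

In practice, `regression_metrics` would be applied to the 24 held-out observations and the corresponding predictions from each fitted model, so that the models are compared on the same test set.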
environmental sciences