Estimating soil organic carbon using Sentinel-2 data under zero tillage agriculture: a machine learning approach

Lawrence Mango, Nuthammachot Narissara, Som-ard Jaturong
DOI: https://doi.org/10.1007/s12145-024-01427-y
2024-09-02
Earth Science Informatics
Abstract: Soil organic carbon (SOC) is the main component of soil organic matter (SOM) and a crucial component of the soil. It supports key soil functions, stabilizes soil structure, aids plant-nutrient retention and release, and promotes water infiltration and storage. Predicting SOC using Sentinel-2 data integrated with machine learning algorithms under zero tillage practice is inadequately documented for developing countries such as Zimbabwe. The purpose of this study is to evaluate the performance of support vector machine (SVM), artificial neural network (ANN), and partial least squares regression (PLSR) algorithms for estimating SOC from Sentinel-2 data. The SVM, ANN and PLSR models were used with cross-validation to estimate the SOC content based on 50 georeferenced calibration samples under a zero-tillage practice. The ANN model outperformed the other two models, explaining between 55 and 60% of SOC variability (coefficient of determination, R² = 0.55–0.60) with a root mean square error (RMSE) between 5.01 and 8.78%, whereas for the SVM model R² varied between 0.53 and 0.57 and RMSE between 6.25 and 11.39%. The PLSR algorithm gave the least accurate SOC estimates, with R² = 0.44–0.49 and RMSE = 7.59–12.42% for the top 15 cm depth. The R², RMSE and mean absolute error (MAE) results for SVM, ANN and PLSR show that the ANN model is highly capable of capturing SOC variability. Although the ANN algorithm provides more accurate SOC estimates than the SVM algorithm, the difference in accuracy is not significant. Results revealed a satisfactory agreement between the SOC content and zero tillage practice, as measured by R², coefficient of variation (CV), MAE and RMSE for the SVM, ANN and PLSR models on the validation dataset with four predictor variables. The mean SOC was 15.83% in the calibration dataset and 17.02% in the validation dataset. The SOC validation dataset (34.17%) had a higher degree of variation around its mean than the calibration dataset (29.86%). The SOC prediction results can be used as an important tool for informed decisions about soil health and productivity by farmers, land managers and policy makers.
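The abstract compares the models using R², RMSE, MAE and CV but does not state how each was computed. As a minimal sketch, assuming the standard textbook definitions (and a sample standard deviation for CV), the four statistics could be calculated as follows. The function names and the input values are hypothetical illustrations, not the authors' code or data.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return 100.0 * math.sqrt(variance) / mean


# Hypothetical values for illustration only; they are NOT from the study.
observed_soc = [2.0, 4.0, 6.0]
predicted_soc = [3.0, 4.0, 5.0]

print("R2  :", r_squared(observed_soc, predicted_soc))
print("RMSE:", rmse(observed_soc, predicted_soc))
print("MAE :", mae(observed_soc, predicted_soc))
print("CV %:", coefficient_of_variation(observed_soc))
```

The abstract reports RMSE in percent, which would follow from these definitions if SOC itself is expressed in percent; a relative RMSE (normalized by the mean) is another common convention, and the abstract does not say which applies.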
Geosciences, Multidisciplinary; Computer Science, Interdisciplinary Applications