An Efficient Model-Agnostic Approach for Uncertainty Estimation in Data-Restricted Pedometric Applications

Viacheslav Barkov, Jonas Schmidinger, Robin Gebbers, Martin Atzmueller

2024-09-18

Abstract: This paper introduces a model-agnostic approach designed to enhance uncertainty estimation in the predictive modeling of soil properties, a crucial factor for advancing pedometrics and the practice of digital soil mapping. For addressing the typical challenge of data scarcity in soil studies, we present an improved technique for uncertainty estimation. This method is based on the transformation of regression tasks into classification problems, which not only allows for the production of reliable uncertainty estimates but also enables the application of established machine learning algorithms with competitive performance that have not yet been utilized in pedometrics. Empirical results from datasets collected from two German agricultural fields showcase the practical application of the proposed methodology. Our results and findings suggest that the proposed approach has the potential to provide better uncertainty estimation than the models commonly used in pedometrics.

Machine Learning

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

The paper aims to address the issue of uncertainty estimation in soil property prediction, particularly in situations with limited data. Specifically:

1. **Data Scarcity**: Soil studies often face the problem of insufficient sample sizes because soil sampling is an expensive and time-consuming process. This results in smaller training datasets, limiting the performance of predictive models.
2. **Uncertainty Estimation**: In digital soil mapping, the uncertainty of model predictions is a critical factor, especially when users such as farmers rely on these predictions for decision-making. Reliable uncertainty measures are essential for building confidence in model outputs and supporting informed actions.

To address these issues, the paper proposes a model-agnostic approach to estimate uncertainty, enabling models to directly output uncertainty estimates without the need for additional calibration datasets. This approach not only avoids further reducing the size of the training dataset, which is advantageous in data-scarce situations, but also introduces machine learning algorithms that have not yet been applied in the field of soil science.

### Method Overview

The paper proposes a general adapter that converts regression tasks into classification problems, thereby utilizing classification algorithms for regression. The specific steps are as follows:

1. **Target Discretization**: Divide the continuous target variable into multiple intervals (referred to as "bins").
2. **Classification Model Training**: Train a classification model to minimize categorical cross-entropy.
3. **Continuous Prediction Reconstruction**: Reconstruct continuous predictions from the output probabilities of the trained classifier.
4. **Model Uncertainty Estimation**: Calculate the standard deviation of the bin structures as a proxy for the model's intrinsic uncertainty.

Additionally, the paper employs an ensemble method that combines model predictions under different bin sizes and strategies to enhance the robustness and reliability of the results.

### Experimental Results

The experimental results demonstrate that the proposed Binned Uncertainty Estimation Ensemble method performs excellently on datasets from two agricultural sites in Germany, particularly in the uncertainty estimation of SOC (Soil Organic Carbon) predictions. The method shows the best results when combined with TabPFN and CatBoost models. It demonstrates its effectiveness through the lowest CRPS values and provides a visual representation of prediction reliability.
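The four steps listed in the method overview can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the quantile-based binning, the use of bin midpoints as representative values, and the k-nearest-neighbour stand-in classifier (`knn_proba`) are all assumptions made for illustration, and every function name here is hypothetical. The paper itself pairs the adapter with classifiers such as TabPFN and CatBoost, trained by minimizing categorical cross-entropy.

```python
import bisect
import math
import random


def quantile_edges(y, n_bins):
    """Step 1: interior bin edges at empirical quantiles (equal-frequency bins)."""
    ys = sorted(y)
    return [ys[(len(ys) * k) // n_bins] for k in range(1, n_bins)]


def to_bin(value, edges):
    """Step 1: index of the bin that contains `value` (0 .. len(edges))."""
    return bisect.bisect_right(edges, value)


def bin_midpoints(y, edges):
    """Representative value per bin (assumption: midpoint of the bin bounds)."""
    bounds = [min(y)] + list(edges) + [max(y)]
    return [(lo + hi) / 2 for lo, hi in zip(bounds, bounds[1:])]


def knn_proba(X_train, labels, x, n_bins, k=15):
    """Step 2 stand-in: class frequencies among the k nearest training points.

    Any classifier that outputs class probabilities could be used instead.
    """
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    nearest = order[:k]
    probs = [0.0] * n_bins
    for i in nearest:
        probs[labels[i]] += 1.0 / len(nearest)
    return probs


def reconstruct(probs, centers):
    """Steps 3 and 4: probability-weighted mean and standard deviation."""
    mean = sum(p * c for p, c in zip(probs, centers))
    var = sum(p * (c - mean) ** 2 for p, c in zip(probs, centers))
    return mean, math.sqrt(max(var, 0.0))


# Synthetic data (purely illustrative, not soil data): y = 2x + noise.
random.seed(0)
X = [(random.uniform(0, 10),) for _ in range(200)]
y = [2.0 * x[0] + random.gauss(0, 1) for x in X]

n_bins = 8
edges = quantile_edges(y, n_bins)
labels = [to_bin(v, edges) for v in y]
centers = bin_midpoints(y, edges)

probs = knn_proba(X, labels, (5.0,), n_bins)
mean, std = reconstruct(probs, centers)
print(f"prediction={mean:.2f}, uncertainty (std)={std:.2f}")
```

The ensemble described above would, under the same assumptions, repeat this procedure for several values of `n_bins` and binning strategies and then combine the resulting predictions.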
