Abstract: Ensuring the reliability and safety of automated decision-making is crucial. It is well-known that data distribution shifts in machine learning can produce unreliable outcomes. This paper proposes a new approach for measuring the reliability of predictions under distribution shifts. We analyze how the outputs of a trained neural network change using clustering to measure distances between outputs and class centroids. We propose this distance as a metric to evaluate the confidence of predictions under distribution shifts. We assign each prediction to a cluster with centroid representing the mean softmax output for all correct predictions of a given class. We then define a safety threshold for a class as the smallest distance from an incorrect prediction to the given class centroid. We evaluate the approach on the MNIST and CIFAR-10 datasets using a Convolutional Neural Network and a Vision Transformer, respectively. The results show that our approach is consistent across these datasets and network models, and indicate that the proposed metric can offer an efficient way of determining when automated predictions are acceptable and when they should be deferred to human operators given a distribution shift.
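
One possible way to write down the quantities named in the abstract is sketched below. The notation is introduced here for illustration and is not taken from the paper. Two assumptions are made: the distance is Euclidean (the abstract does not name a metric), and the "incorrect predictions" for a class are the samples the network assigned to that class although their true label differs. Let $p(x) \in \mathbb{R}^K$ be the softmax output for input $x$, $\hat{y}(x) = \arg\max_k p_k(x)$ the predicted class, and $y(x)$ the true label.

$$
\begin{aligned}
C_k &= \{\, x : \hat{y}(x) = k,\ y(x) = k \,\}, &
I_k &= \{\, x : \hat{y}(x) = k,\ y(x) \neq k \,\}, \\
\mu_k &= \frac{1}{|C_k|} \sum_{x \in C_k} p(x), &
d_k(x) &= \lVert p(x) - \mu_k \rVert_2, \\
\tau_k &= \min_{x \in I_k} d_k(x).
\end{aligned}
$$

Under these assumptions, a prediction with $\hat{y}(x) = k$ would be accepted when $d_k(x) < \tau_k$ and deferred to a human operator otherwise.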

What problem does this paper attempt to address?

### What problem does this paper attempt to solve?

This paper addresses the reliability and safety of automated decision-making by machine learning models under distribution shift. Specifically, the authors propose a new method for measuring the reliability of neural network predictions when the data encountered in deployment is distributed differently from the training data.

#### The distribution shift problem

Distribution shift means that the data a model encounters after deployment follows a different distribution from its training data. This can cause a significant decline in model performance, which matters most in high-risk fields where wrong predictions may have serious consequences. Distribution shift can take forms such as covariate shift, concept drift, or domain shift.

#### The proposed new method

To address this problem, the authors propose a method based on clustering and softmax distance to evaluate the reliability of predictions. The steps are as follows:

1. **Clustering and class centroid calculation**:
   - Calculate the centroid of each class from the correct predictions in the training set. The centroid is the mean softmax output over all correct predictions of that class.
   - For each prediction, calculate its distance from the corresponding class centroid.
2. **Define a safety threshold**:
   - For each class, define a safety threshold as the smallest distance from an incorrect prediction to the class centroid. This threshold is used to decide whether a prediction is reliable enough.
3. **Evaluation method**:
   - The method is evaluated on the MNIST and CIFAR-10 datasets, using a Convolutional Neural Network (CNN) and a Vision Transformer (ViT), respectively.
   - The experimental results indicate that the method can help determine when to accept automated predictions and when human intervention is required.

#### Main contributions

- Propose a lightweight method that uses distance measurement and clustering to quantify the reliability of neural network predictions under distribution shift.
- Combine metric-based and accuracy-based methods to deal with distribution shift, and demonstrate the approach on different network architectures (CNN and ViT).

### Summary

The core problem of this paper is ensuring the reliability and safety of machine learning models facing distribution shift. By introducing a method based on clustering and softmax distance, the authors provide a means to evaluate the reliability of model predictions and to balance automated decision-making against human intervention.
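
The centroid and safety-threshold steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. The function names `fit_centroids_and_thresholds` and `accept_mask` are hypothetical, the Euclidean distance is an assumption (the summary does not name a metric), and the toy softmax values are made up for the example.

```python
import numpy as np

def fit_centroids_and_thresholds(softmax, y_true):
    """Per-class centroids and safety thresholds from labelled softmax outputs.

    softmax: (n_samples, n_classes) array of softmax outputs.
    y_true:  (n_samples,) array of integer labels.
    """
    y_pred = softmax.argmax(axis=1)
    correct = y_pred == y_true
    n_classes = softmax.shape[1]
    centroids = np.full((n_classes, n_classes), np.nan)
    thresholds = np.full(n_classes, np.inf)
    for k in range(n_classes):
        hits = softmax[correct & (y_pred == k)]
        if len(hits) == 0:
            continue  # no correct predictions: centroid stays NaN, class is always deferred
        # Centroid: mean softmax output over correct predictions of class k.
        centroids[k] = hits.mean(axis=0)
        # Assumption: "incorrect predictions" are samples predicted as k with another true label.
        misses = softmax[~correct & (y_pred == k)]
        if len(misses) > 0:
            # Threshold: smallest (Euclidean, assumed) distance from a miss to the centroid.
            thresholds[k] = np.linalg.norm(misses - centroids[k], axis=1).min()
    return centroids, thresholds

def accept_mask(softmax, centroids, thresholds):
    """True where a prediction lies strictly inside its class's safety threshold."""
    y_pred = softmax.argmax(axis=1)
    dist = np.linalg.norm(softmax - centroids[y_pred], axis=1)
    return dist < thresholds[y_pred]

# Toy two-class data (illustrative values only).
train_softmax = np.array([
    [0.90, 0.10],  # true 0, predicted 0 (correct)
    [0.80, 0.20],  # true 0, predicted 0 (correct)
    [0.60, 0.40],  # true 1, predicted 0 (incorrect)
    [0.10, 0.90],  # true 1, predicted 1 (correct)
    [0.30, 0.70],  # true 1, predicted 1 (correct)
    [0.45, 0.55],  # true 0, predicted 1 (incorrect)
])
train_labels = np.array([0, 0, 1, 1, 1, 0])
centroids, thresholds = fit_centroids_and_thresholds(train_softmax, train_labels)

new_softmax = np.array([[0.88, 0.12], [0.55, 0.45]])
accept = accept_mask(new_softmax, centroids, thresholds)
print(accept)  # first prediction is accepted, second is deferred
```

In this sketch a class with no incorrect predictions gets an infinite threshold, so every prediction for it is accepted; a stricter default may be preferable in practice.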

When to Accept Automated Predictions and When to Defer to Human Judgment?

Evaluation of autonomous systems under data distribution shifts

Predicting with confidence – an improved dynamic cell structure

Prediction Accuracy & Reliability: Classification and Object Localization under Distribution Shift

Limits of Probabilistic Safety Guarantees when Considering Human Uncertainty

Robust Validation: Confident Predictions Even When Distributions Shift

Dependable Neural Networks for Safety Critical Tasks

A Holistic Assessment of the Reliability of Machine Learning Systems

Evaluation of Predictive Reliability to Foster Trust in Artificial Intelligence. A case study in Multiple Sclerosis

Enabling uncertainty estimation in neural networks through weight perturbation for improved Alzheimer's disease classification

Reliable Probabilistic Human Trajectory Prediction for Autonomous Applications

A Trustworthiness Score to Evaluate DNN Predictions

Uncertainty-Aware Prediction Validator in Deep Learning Models for Cyber-Physical System Data

Prediction Confidence from Neighbors

Towards Trustworthy Predictions from Deep Neural Networks with Fast Adversarial Calibration

Adaptive, Distribution-Free Prediction Intervals for Deep Networks

DC4L: Distribution Shift Recovery via Data-Driven Control for Deep Learning Models

How Reliable is Your Regression Model's Uncertainty Under Real-World Distribution Shifts?

Preventing Arbitrarily High Confidence on Far-Away Data in Point-Estimated Discriminative Neural Networks

Towards Safe Machine Learning for CPS: Infer Uncertainty from Training Data

Inadequacy of common stochastic neural networks for reliable clinical decision support