Abstract: Our goal in this paper is to exploit heteroscedastic temperature scaling as a calibration strategy for out-of-distribution (OOD) detection. Heteroscedasticity here refers to the fact that the optimal temperature parameter for each sample can be different, as opposed to conventional approaches that use the same value for the entire distribution. To enable this, we propose a new training strategy called anchoring that can estimate appropriate temperature values for each sample, leading to state-of-the-art OOD detection performance across several benchmarks. Using NTK theory, we show that this temperature function estimate is closely linked to the epistemic uncertainty of the classifier, which explains its behavior. In contrast to some of the best-performing OOD detection approaches, our method does not require exposure to additional outlier datasets, custom calibration objectives, or model ensembling. Through empirical studies with different OOD detection settings (far OOD, near OOD, and semantically coherent OOD), we establish a highly effective OOD detection approach. Code to reproduce our results is available at http://github.com/LLNL/AMP

### What problem does this paper attempt to address?

This paper addresses **out-of-distribution (OOD) detection**. Specifically, the authors use **heteroscedastic temperature scaling** as a calibration strategy to improve OOD detection performance.

#### Background and problem description

Machine learning models are usually trained on a specific data distribution. In practice, however, a model may encounter samples drawn from a distribution different from the training data; these are called out-of-distribution (OOD) samples. Traditional OOD detection methods usually rely on scoring functions such as maximum softmax probability or entropy, but these often perform poorly in practice, especially in complex OOD scenarios.

#### Core contributions of the paper

1. **Heteroscedastic temperature scaling**: Unlike conventional homoscedastic temperature scaling, which applies one temperature value to all samples, the authors estimate a separate temperature parameter for each sample. This lets the calibration adapt to the uncertainty of individual samples, which improves OOD detection accuracy.
2. **Neural network anchoring**: To implement heteroscedastic temperature scaling, the authors introduce a new training strategy called neural network anchoring. Each input image is re-expressed as an anchor point together with its residual, and the model is trained with a consistency strategy. At inference time, the temperature of each sample is estimated from the model's predictions under multiple random anchor points.
3. **Theoretical explanation**: Using Neural Tangent Kernel (NTK) theory, the authors show that the proposed heteroscedastic temperature estimate is closely linked to the epistemic uncertainty of the classifier, which explains why it is effective.
4. **Experimental verification**: Through experiments on multiple benchmark datasets, the authors show that the method achieves state-of-the-art performance in several OOD detection settings (far OOD, near OOD, and semantically coherent OOD).

#### Formula representation

- The formulas for heteroscedastic temperature scaling are as follows:
\[ H(y|x)=\text{MEAN}[f_A([c_k, x - c_k])]_{k = 1}^K \]
\[ \tau(x)=\sum_{\text{all classes}}\text{STD-DEV}[\sigma(f_A([c_k, x - c_k]))]_{k = 1}^K \]
where \(H(y|x)\) represents the logits for each class, \(\tau(x)\) is the temperature value of sample \(x\), \(f_A\) is the prediction function of the model after the anchoring transformation, and \(c_k\) is a randomly selected anchor point.
- The formula for the calibrated prediction score:
\[ H_c(y|x)=\frac{H(y|x)}{\tau(x)} \]
- The AMP scoring function:
\[ \text{AMP}(x)=-\frac{1}{N}\sum_{\text{all classes}}\log(\text{SOFTMAX}(H_c(y|x))) \]

Together, these components yield an OOD detection method that does not require additional outlier datasets, custom calibration objectives, or model ensembling.
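The formulas above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation (see the linked repository for that). It assumes the logits \(f_A([c_k, x - c_k])\) for the \(K\) anchors have already been computed and stacked into a `(K, C)` array, that \(\sigma\) denotes the softmax, and that \(N\) is the number of classes; none of these is stated explicitly in the summary. The function name `amp_score` and the `eps` guard against a zero temperature are hypothetical additions.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def log_softmax(z, axis=-1):
    """Numerically stable log-softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))


def amp_score(anchor_logits, eps=1e-8):
    """Compute (AMP(x), tau(x)) for a single sample x.

    anchor_logits: array of shape (K, C); row k is assumed to hold the
    logits f_A([c_k, x - c_k]) obtained with the k-th random anchor.
    eps: hypothetical guard so that tau(x) = 0 does not divide by zero.
    """
    anchor_logits = np.asarray(anchor_logits, dtype=float)

    # H(y|x): mean logits over the K anchors, shape (C,)
    mean_logits = anchor_logits.mean(axis=0)

    # tau(x): per-class std-dev of the softmax outputs over anchors,
    # summed over all classes (sigma assumed to be the softmax)
    probs = softmax(anchor_logits, axis=1)
    tau = probs.std(axis=0).sum()

    # H_c(y|x) = H(y|x) / tau(x)
    calibrated = mean_logits / (tau + eps)

    # AMP(x) = -(1/N) * sum over classes of log softmax(H_c(y|x)),
    # with N assumed to be the number of classes C
    score = -log_softmax(calibrated).mean()
    return score, tau


# Toy inputs: three anchors that agree, and three that disagree.
agree = np.array([[5.0, 0.0, 0.0], [5.1, 0.0, 0.1], [4.9, 0.1, 0.0]])
disagree = np.array([[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]])

score_agree, tau_agree = amp_score(agree)
score_disagree, tau_disagree = amp_score(disagree)
print(f"agree:    tau={tau_agree:.4f}  AMP={score_agree:.4f}")
print(f"disagree: tau={tau_disagree:.4f}  AMP={score_disagree:.4f}")
```

In this sketch, anchors that disagree produce a larger \(\tau(x)\), which flattens the calibrated logits, while anchors that agree produce a small \(\tau(x)\) and sharper calibrated logits. How the resulting score is thresholded to flag OOD samples is not specified in the summary and is left out here.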

Out of Distribution Detection via Neural Network Anchoring

Out-of-Distribution Detection & Applications With Ablated Learned Temperature Energy

Density-driven Regularization for Out-of-distribution Detection

Calibrated Out-of-Distribution Detection with a Generic Representation

Non-Linear Outlier Synthesis for Out-of-Distribution Detection

A Simple Test-Time Method for Out-of-Distribution Detection

Out-of-Distribution Detection with Deep Nearest Neighbors

Rethinking Out-of-Distribution Detection on Imbalanced Data Distribution

Certifiably Adversarially Robust Detection of Out-of-Distribution Data

Scaling for Training Time and Post-hoc Out-of-distribution Detection Enhancement

Continual Unsupervised Out-of-Distribution Detection

Out-of-Distribution Detection Using Outlier Detection Methods

Advancing Out-of-Distribution Detection through Data Purification and Dynamic Activation Function Design

Mahalanobis-Aware Training for Out-of-Distribution Detection

Out-of-Distribution Detection using Neural Activation Prior

Are all outliers alike? On Understanding the Diversity of Outliers for Detecting OODs

OAL: Enhancing OOD Detection Using Latent Diffusion

Towards In-Distribution Compatible Out-of-Distribution Detection.

Deep Discriminative to Kernel Density Graph for In- and Out-of-distribution Calibrated Inference

Going Beyond Conventional OOD Detection

Expecting The Unexpected: Towards Broad Out-Of-Distribution Detection