Abstract: Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and test data by adapting a given model w.r.t. any test sample. Although recent TTA has shown promising performance, we still face two key challenges: 1) prior methods perform backpropagation for each test sample, resulting in prohibitive optimization costs for many applications; 2) while existing TTA methods can significantly improve the test performance on out-of-distribution data, they often suffer from severe performance degradation on in-distribution data after TTA (known as forgetting). To this end, we have proposed an Efficient Anti-Forgetting Test-Time Adaptation (EATA) method which develops an active sample selection criterion to identify reliable and non-redundant samples for test-time entropy minimization. To alleviate forgetting, EATA introduces a Fisher regularizer estimated from test samples to constrain important model parameters from drastic changes. However, in EATA, the adopted entropy loss consistently assigns higher confidence to predictions even for samples that are inherently uncertain, leading to overconfident predictions. To tackle this, we further propose EATA with Calibration (EATA-C) to separately exploit the reducible model uncertainty and the inherent data uncertainty for calibrated TTA. Specifically, we measure the model uncertainty by the divergence between predictions from the full network and its sub-networks, on which we propose a divergence loss to encourage consistent predictions instead of overconfident ones. To further recalibrate prediction confidence, we utilize the disagreement among predicted labels as an indicator of the data uncertainty, and then devise a min-max entropy regularizer to selectively increase and decrease prediction confidence for different samples. Experiments on image classification and semantic segmentation verify the effectiveness of our methods.

What problem does this paper attempt to address?

The paper mainly addresses the performance degradation of deep neural networks when the test data distribution differs from the training data distribution, and proposes two methods: Efficient Anti-Forgetting Test-Time Adaptation (EATA) and EATA with Calibration (EATA-C).

1. **EATA** Method:
   - **Objective**: To improve the efficiency of test-time adaptation (TTA) and address the "catastrophic forgetting" caused by existing TTA strategies.
   - **Technical Means**:
     - **Sample-efficient Entropy Minimization**: An active sample selection strategy skips unnecessary backpropagation and so improves overall TTA efficiency. Specifically, prediction entropy is used to identify reliable samples, and redundant samples are excluded to further enhance efficiency.
     - **Anti-forgetting Weight Regularization**: An importance-aware regularizer (based on the Fisher information matrix) keeps parameters that are important to the in-distribution (ID) domain from changing drastically during TTA, thereby mitigating "catastrophic forgetting."
2. **EATA-C** Method:
   - **Objective**: To further address overconfident predictions, where the model gives high-confidence predictions even for uncertain data.
   - **Technical Means**:
     - **Model Uncertainty Reduction**: Model uncertainty is estimated from the discrepancy between the predictions of the full network and its sub-networks, and a consistency loss reduces this uncertainty without pushing predictions toward overconfidence.
     - **Prediction Uncertainty Recalibration**: The inconsistency between predicted labels serves as an indicator of data uncertainty, and a min-max entropy regularizer selectively adjusts the prediction confidence based on the inherent data uncertainty of each sample.

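The two EATA components listed above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The summary does not say how redundancy is measured, so the sketch assumes a cosine-similarity check against a running average of previously selected predictions. The function names, the thresholds `entropy_max` and `sim_max`, and the `momentum` value are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def select_samples(batch_probs, entropy_max, sim_max, momentum=0.9):
    """Return indices of samples judged reliable and non-redundant.

    Reliable: prediction entropy is below entropy_max.
    Non-redundant (assumed criterion): cosine similarity to a running
    average of previously selected predictions is below sim_max.
    """
    selected = []
    running = None  # running average of selected predictions
    for i, probs in enumerate(batch_probs):
        if entropy(probs) >= entropy_max:
            continue  # too uncertain: skip backpropagation for this sample
        if running is not None and cosine(probs, running) >= sim_max:
            continue  # too similar to what was already used: skip
        selected.append(i)
        if running is None:
            running = list(probs)
        else:
            running = [momentum * r + (1.0 - momentum) * p
                       for r, p in zip(running, probs)]
    return selected


def fisher_penalty(params, anchor_params, fisher, strength=1.0):
    """Importance-weighted squared drift from the pre-adaptation weights.

    fisher holds one (diagonal) importance value per parameter, so
    parameters with large importance are penalized more for moving.
    """
    return strength * sum(f * (p - a) ** 2
                          for p, a, f in zip(params, anchor_params, fisher))
```

In this sketch, only the indices returned by `select_samples` would contribute to the entropy loss, and `fisher_penalty` would be added to that loss so that high-importance parameters stay close to their original values.
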
In summary, the methods proposed in the paper aim to improve the efficiency, stability, and accuracy of test-time adaptation, especially in handling distribution shifts, effectively enhancing the model's generalization ability to unknown data while maintaining performance on the original data.
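The calibration idea in EATA-C can be sketched per sample as follows. This is an illustration under stated assumptions, not the paper's exact loss: KL divergence stands in for the unspecified "divergence", a single sub-network is used, the entropy term is applied to the sub-network prediction, and the rule "agreeing labels lower entropy, disagreeing labels raise it" is one plausible reading of the min-max regularizer. The name `calibrated_loss` and the `reg_weight` parameter are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two probability vectors; eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0.0)


def argmax(probs):
    """Index of the largest probability, i.e. the predicted label."""
    return max(range(len(probs)), key=probs.__getitem__)


def calibrated_loss(full_probs, sub_probs, reg_weight=0.1):
    """Consistency loss plus a signed entropy term for one sample.

    The consistency term shrinks as the sub-network agrees with the full
    network (a proxy for reducible model uncertainty). The entropy term is
    minimized when the predicted labels agree and maximized (negative
    sign) when they disagree, which is treated as data uncertainty.
    """
    consistency = kl_divergence(full_probs, sub_probs)
    labels_agree = argmax(full_probs) == argmax(sub_probs)
    sign = 1.0 if labels_agree else -1.0
    return consistency + reg_weight * sign * entropy(sub_probs)
```

Minimizing this quantity raises confidence only where the two predictions already agree on the label, and lowers it where they disagree, instead of sharpening every prediction as plain entropy minimization would.
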

Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting

Efficient Test-Time Model Adaptation without Forgetting
