Abstract: Real-world application models are commonly deployed in dynamic environments, where the target domain distribution undergoes temporal changes. Continual Test-Time Adaptation (CTTA) has recently emerged as a promising technique to gradually adapt a source-trained model to continually changing target domains. Despite recent advancements in addressing CTTA, two critical issues remain: 1) Fixed thresholds for pseudo-labeling in existing methodologies generate low-quality pseudo-labels, as model confidence varies across categories and domains; 2) Stochastic parameter restoration methods for mitigating catastrophic forgetting fail to effectively preserve critical information due to their intrinsic randomness. To tackle these challenges for detection models in CTTA scenarios, we present CTAOD, featuring three core components. Firstly, the object-level contrastive learning module extracts object-level features for contrastive learning to refine the feature representation in the target domain. Secondly, the adaptive monitoring module dynamically skips unnecessary adaptation and updates the category-specific threshold based on predicted confidence scores to enable efficiency and improve the quality of pseudo-labels. Lastly, the data-driven stochastic restoration mechanism selectively resets inactive parameters with higher probability, ensuring the retention of essential knowledge. We demonstrate the effectiveness of CTAOD on four CTTA object detection tasks, where CTAOD outperforms existing methods, especially achieving a 3.2 mAP improvement and a 20% increase in efficiency on the Cityscapes-to-Cityscapes-C CTTA task. The code will be released.

What problem does this paper attempt to address?

This paper addresses how an object detection model can adapt at test time in a continuously changing environment (Continual Test-Time Adaptation, CTTA). It focuses on two challenges:

1. **Error accumulation caused by low-quality pseudo-labels**: Existing methods generate pseudo-labels with a fixed threshold, which ignores the fact that model confidence differs across classes and domains. The resulting low-quality pseudo-labels introduce errors that accumulate and degrade model performance.
2. **Retaining current-domain knowledge while alleviating source-knowledge forgetting in a dynamic environment**: Under continuous distribution shift, existing methods are prone to catastrophic forgetting: the model loses previously learned knowledge and also struggles to retain the important information of the current domain.

To solve these problems, the authors propose CTAOD (Continual Test-time Adaptation for Object Detection), which contains three core components:

- **Object-level Contrastive Learning (OCL)**: Extracts object-level features for contrastive learning to improve the feature representation in the target domain.
- **Adaptive Monitoring (AM)**: Dynamically skips unnecessary adaptation steps and updates the class-specific thresholds according to the predicted confidence scores, improving both pseudo-label quality and adaptation efficiency.
- **Data-driven Stochastic Restoration (DSR)**: Selectively resets inactive parameters to retain key knowledge and improve robustness to forgetting.

### Formula Summary

1. **Contrastive learning loss function**: \[ L_{cl}(x_t) = -\frac{1}{l}\sum_{i=1}^{l}\log\frac{\exp(f_T^i\cdot f_S^i/\tau)}{\sum_{j=1}^{l}\exp(f_T^i\cdot f_S^j/\tau)} \] where \(f_T^i\) and \(f_S^i\) are the features extracted by the teacher and student models, respectively, from different augmented views of the same object, \(\tau\) is the temperature parameter, and \(l\) is the number of features.
2. **KL-divergence loss function**: \[ L_{kl}(P\|Q) = \sum_{x\in X}P(x)\log\frac{P(x)}{Q(x)} \] which quantifies the difference between two probability distributions.
3. **Exponential moving average update of the Adaptive Monitoring module**: \[ l_{ema}^t \leftarrow \beta_s\cdot l_{ema}^{t-1} + (1-\beta_s)\cdot l_t \] where \(\beta_s\) is the update rate and \(l_t\) is the average prediction score of all classes.
4. **Dynamic threshold update formula**: \[ \delta_c^t \leftarrow \beta_t\cdot\delta_c^{t-1} + (1-\beta_t)\cdot\epsilon\cdot(l_c^t)^{1/2} \] where \(\delta_c^t\) is the threshold of class \(c\) at time step \(t\), \(\beta_t\) is the update rate, \(\epsilon\) provides a linear projection, and \(l_c^t\) is the average prediction score of class \(c\).
5. **Data-driven Stochastic Restoration mechanism**: \[ M_t = F_t < \eta \]
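Formulas 1 to 4 above can be sketched in a few lines of plain Python to make the arithmetic concrete. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the list-of-lists feature layout, and the default temperature `tau=0.07` are all hypothetical choices made for the example. The restoration mask (formula 5) is left out because its terms are not defined in the summary.

```python
import math


def contrastive_loss(teacher_feats, student_feats, tau=0.07):
    """Formula 1: mean cross-entropy of matching teacher/student pairs.

    teacher_feats[i] and student_feats[i] are assumed to describe the same
    object; tau=0.07 is an illustrative default, not a value from the paper.
    """
    l = len(teacher_feats)
    total = 0.0
    for i in range(l):
        # Dot product of teacher feature i with every student feature j, scaled by tau.
        logits = [
            sum(a * b for a, b in zip(teacher_feats[i], student_feats[j])) / tau
            for j in range(l)
        ]
        # Numerically stable log-sum-exp for the denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denom - logits[i]
    return total / l


def kl_divergence(p, q):
    """Formula 2: KL(P || Q) for discrete distributions given as lists.

    Terms with P(x) = 0 contribute nothing and are skipped.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def ema_update(prev, new, beta):
    """Formula 3: exponential moving average of the mean prediction score."""
    return beta * prev + (1 - beta) * new


def update_threshold(delta_prev, class_score, beta_t, eps):
    """Formula 4: per-class threshold driven by the square root of the class score."""
    return beta_t * delta_prev + (1 - beta_t) * eps * math.sqrt(class_score)


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0]]
    print(contrastive_loss(feats, feats, tau=1.0))
    print(kl_divergence([0.5, 0.5], [0.9, 0.1]))
    print(ema_update(1.0, 0.0, 0.9))
    print(update_threshold(0.5, 0.64, 0.9, 1.0))
```

For instance, with a previous threshold of 0.5, a class score of 0.64, \(\beta_t = 0.9\), and \(\epsilon = 1\), the update gives \(0.9\cdot 0.5 + 0.1\cdot 0.8 = 0.53\): a confident class nudges its own threshold upward, while the EMA keeps the change gradual.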

Exploring Test-Time Adaptation for Object Detection in Continually Changing Environments

Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation

Distribution-Aware Continual Test-Time Adaptation for Semantic Segmentation

Hybrid-TTA: Continual Test-time Adaptation via Dynamic Domain Shift Detection

STFAR: Improving Object Detection Robustness at Test-Time by Self-Training with Feature Alignment Regularization

Controllable Continual Test-Time Adaptation

Weakly Supervised Test-Time Domain Adaptation for Object Detection

Continual Test-time Domain Adaptation via Dynamic Sample Selection

PALM: Pushing Adaptive Learning Rate Mechanisms for Continual Test-Time Adaptation

Multi-Modal Continual Test-Time Adaptation for 3D Semantic Segmentation

A Versatile Framework for Continual Test-Time Domain Adaptation: Balancing Discriminability and Generalizability

Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation

Test-time Adaptation in the Dynamic World with Compound Domain Knowledge Management

Better Regression Makes Better Test-time Adaptive 3D Object Detection

MLFA: Toward Realistic Test Time Adaptive Object Detection by Multi-Level Feature Alignment

PCoTTA: Continual Test-Time Adaptation for Multi-Task Point Cloud Understanding

Analytic Continual Test-Time Adaptation for Multi-Modality Corruption

Test-time adaptation for geospatial point cloud semantic segmentation with distinct domain shifts

Fully Test-Time Adaptation for Monocular 3D Object Detection

Parameter-Selective Continual Test-Time Adaptation

ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation