On the Robustness of Open-World Test-Time Training: Self-Training with Dynamic Prototype Expansion

Yushu Li, Xun Xu, Yongyi Su, Kui Jia
2023-08-19
Abstract: Generalizing deep learning models to an unknown target domain distribution with low latency has motivated research into test-time training/adaptation (TTT/TTA). Existing approaches often focus on improving test-time training performance under well-curated target domain data. As shown in this work, many state-of-the-art methods fail to maintain their performance when the target domain is contaminated with strong out-of-distribution (OOD) data, a.k.a. open-world test-time training (OWTTT). The failure is mainly due to the inability to distinguish strong OOD samples from regular weak OOD samples. To improve the robustness of OWTTT, we first develop an adaptive strong OOD pruning which improves the efficacy of the self-training TTT method. We further propose a way to dynamically expand the prototypes to represent strong OOD samples for an improved weak/strong OOD data separation. Finally, we regularize self-training with distribution alignment, and the combination yields state-of-the-art performance on 5 OWTTT benchmarks. The code is available at https://github.com/Yushu-Li/OWTTT.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper addresses the challenges of Open-World Test-Time Training (OWTTT), where the target domain data is contaminated with strong out-of-distribution (strong OOD) samples.

### Research Background and Problem Definition

- **Test-Time Training (TTT)**: A method that lets a pre-trained model adapt to unknown target domain data during inference, without access to source domain data.
- **Problems with Existing Methods**: Many existing TTT methods perform poorly when the target domain data contains strong OOD samples. These samples may come from different semantic categories or may be random noise, and the model struggles to distinguish them from regular weak OOD samples.
- **Specific Challenges**:
  - Self-training methods mishandle strong OOD samples because they must assign every test sample to a known category.
  - Distribution alignment-based methods are also affected when strong OOD samples are included in the estimate of the target domain distribution.

### Solution Overview

The paper proposes a two-stage method to improve the robustness of OWTTT:

1. **Strong OOD Sample Pruning**:
   - Identifies and excludes strong OOD samples without requiring hyperparameters, reducing their negative impact on the self-training process.
   - Uses a dynamic threshold to distinguish strong OOD samples from weak OOD samples.
2. **Prototype Expansion**:
   - Dynamically expands the prototype pool with new prototypes that represent strong OOD samples.
   - This lets strong OOD samples form tighter clusters in the feature space, which separates them more cleanly from weak OOD samples.

Additionally, the paper incorporates distribution alignment as a regularization term to further improve robustness, and proposes a benchmark covering multiple types of domain shift to evaluate the OWTTT protocol.
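
The pruning step can be illustrated with a minimal sketch. The summary above does not give the exact scoring rule or threshold criterion, so both are assumptions here: the strong OOD score is taken to be one minus the highest cosine similarity to any known-class prototype, and the threshold is the cut that minimizes the total within-group spread of the scores (an Otsu-style split). All function names are hypothetical and do not come from the authors' code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def strong_ood_score(feature, prototypes):
    """Assumed score: 1 - highest cosine similarity to a known-class prototype."""
    return 1.0 - max(cosine(feature, p) for p in prototypes)


def within_group_spread(group):
    """Sum of squared deviations from the group mean (0 for an empty group)."""
    if not group:
        return 0.0
    mean = sum(group) / len(group)
    return sum((s - mean) ** 2 for s in group)


def dynamic_threshold(scores):
    """Choose the cut that minimizes the summed spread of the two groups.

    Candidate cuts are midpoints between consecutive sorted scores, so no
    threshold value is tuned by hand. With fewer than two scores there is
    nothing to split, and the threshold stays at +inf (keep everything).
    """
    ordered = sorted(scores)
    best_cut, best_cost = float("inf"), float("inf")
    for i in range(1, len(ordered)):
        cut = (ordered[i - 1] + ordered[i]) / 2.0
        cost = within_group_spread(ordered[:i]) + within_group_spread(ordered[i:])
        if cost < best_cost:
            best_cut, best_cost = cut, cost
    return best_cut


def prune_strong_ood(features, prototypes):
    """Return (kept_indices, threshold); kept samples would feed self-training."""
    scores = [strong_ood_score(f, prototypes) for f in features]
    tau = dynamic_threshold(scores)
    kept = [i for i, s in enumerate(scores) if s <= tau]
    return kept, tau


# Toy 2-D example: two class prototypes and four test features (made-up data).
prototypes = [[1.0, 0.0], [0.0, 1.0]]
features = [[0.9, 0.1], [0.1, 0.95], [0.7, 0.7], [-1.0, -1.0]]
kept, tau = prune_strong_ood(features, prototypes)
```

In this toy batch the last feature points away from both prototypes, so it lands above the threshold and is pruned, while the other three are kept. Recomputing the threshold per batch is what makes it dynamic in this sketch; the actual method may estimate it differently, for example over a running window of scores.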

### Main Contributions

- Identifies an important issue overlooked in existing TTT research: TTT methods may fail in the open-world setting (OWTTT), where strong OOD samples are present.
- Proposes a prototype clustering-based baseline method and develops a strong OOD detector and a prototype expansion technique to improve robustness under OWTTT.
- Establishes a benchmark covering various types of domain shift, including common corruptions and style transfers, and achieves state-of-the-art performance on it.
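
The prototype expansion idea can be sketched as a small pool that grows at test time. This is an illustrative sketch under stated assumptions, not the authors' implementation: the class `PrototypePool`, its methods, the cosine-based score, and the `max_expanded` cap are all hypothetical, and the threshold is taken as given (for instance, a dynamic threshold computed elsewhere).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


class PrototypePool:
    """Known-class prototypes plus prototypes added at test time (hypothetical)."""

    def __init__(self, source_prototypes, max_expanded=100):
        self.source = [list(p) for p in source_prototypes]
        self.expanded = []
        self.max_expanded = max_expanded  # assumed cap to bound memory

    def score(self, feature):
        """1 - highest cosine similarity over source and expanded prototypes."""
        pool = self.source + self.expanded
        return 1.0 - max(cosine(feature, p) for p in pool)

    def maybe_expand(self, feature, threshold):
        """Add `feature` as a new prototype if it is far from every existing one."""
        if self.score(feature) > threshold:
            self.expanded.append(list(feature))
            if len(self.expanded) > self.max_expanded:
                self.expanded.pop(0)  # drop the oldest expanded prototype
            return True
        return False

    def is_strong_ood(self, feature):
        """Strong OOD if the nearest prototype is an expanded one."""
        best_src = max(cosine(feature, p) for p in self.source)
        best_exp = max((cosine(feature, p) for p in self.expanded), default=-1.0)
        return best_exp > best_src


# Toy usage with made-up 2-D features and an illustrative threshold of 0.5.
pool = PrototypePool([[1.0, 0.0], [0.0, 1.0]])
added_first = pool.maybe_expand([-1.0, -1.0], threshold=0.5)
added_second = pool.maybe_expand([-0.9, -1.1], threshold=0.5)
```

The second call adds nothing because the first expanded prototype already covers that region of feature space. This is the intended effect: later strong OOD samples are matched to an expanded prototype instead of being forced onto a known class.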