Black-box Unsupervised Domain Adaptation with Bi-directional Atkinson-Shiffrin Memory

Jingyi Zhang, Jiaxing Huang, Xueying Jiang, Shijian Lu
2023-08-25
Abstract: Black-box unsupervised domain adaptation (UDA) learns with source predictions of target data without accessing either source data or source models during training, and it has clear superiority in data privacy and flexibility in target network selection. However, the source predictions of target data are often noisy and training with them is prone to learning collapses. We propose BiMem, a bi-directional memorization mechanism that learns to remember useful and representative information to correct noisy pseudo labels on the fly, leading to robust black-box UDA that can generalize across different visual recognition tasks. BiMem constructs three types of memory, including sensory memory, short-term memory, and long-term memory, which interact in a bi-directional manner for comprehensive and robust memorization of learnt features. It includes a forward memorization flow that identifies and stores useful features and a backward calibration flow that rectifies features' pseudo labels progressively. Extensive experiments show that BiMem achieves superior domain adaptation performance consistently across various visual recognition tasks such as image classification, semantic segmentation and object detection.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses training collapse in black-box unsupervised domain adaptation (UDA), which is caused by noisy pseudo-labels on the target data. Existing black-box UDA methods train on the source model's initial predictions for the target data, and these predictions often contain errors. The result is a "forgetting" phenomenon during self-training: the model learns useful target-domain information early on, but accumulated pseudo-label noise gradually degrades its performance as training progresses, sometimes to below that of a model trained only on source-domain data. To overcome this, the paper proposes BiMem, a bi-directional memory mechanism that corrects pseudo-label noise by constructing and calibrating three types of memory (sensory, short-term, and long-term), making black-box UDA more stable and effective.

### Main Contributions

1. **General Framework**: BiMem is a general black-box UDA framework applicable to different visual recognition tasks. To the best of the authors' knowledge, this is the first work to explore and benchmark black-box UDA across different visual recognition tasks.
2. **Memory Mechanism**: Three types of memory interact bi-directionally, which reduces the "forgetting" of useful and representative features, improves the accuracy of the target data's pseudo-labels, and so yields better adaptation in black-box UDA.
3. **Experimental Verification**: Extensive experiments on multiple benchmark datasets show that BiMem achieves superior performance in image classification, semantic segmentation, and object detection.

### Method Overview

The core idea of BiMem is to address the "forgetting" problem in black-box UDA by constructing and calibrating three types of memory:

- **Sensory Memory**: Buffers the features of the current batch to capture fresh knowledge.
- **Short-Term Memory**: Actively selects and stores difficult samples from the sensory memory, which usually have high classification uncertainty.
- **Long-Term Memory**: Stores global and representative information by class-wise compression and accumulation of all features removed from the sensory memory and short-term memory.

### Memory Update and Calibration

- **Forward Memorization Flow**: Updates the sensory, short-term, and long-term memories so that they capture fresh and representative information.
- **Backward Calibration Flow**: Calibrates the short-term memory with the long-term memory, then calibrates the sensory memory jointly with the calibrated short-term memory and the long-term memory, progressively correcting the features' pseudo-labels.

### Experimental Results

The paper reports experiments on multiple visual tasks: semantic segmentation (GTA5 → Cityscapes and SYNTHIA → Cityscapes), object detection (Cityscapes → Foggy Cityscapes and SYNTHIA → Cityscapes), and image classification (Office-Home and Office-31). The results show that BiMem significantly outperforms existing black-box UDA methods on all tasks, especially semantic segmentation and object detection.

### Conclusion

By constructing and calibrating three types of memory, BiMem mitigates the "forgetting" problem in black-box UDA and improves the model's adaptability and robustness across different visual tasks. The method offers a new approach to black-box UDA with potential for practical use.
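
### Illustrative Sketch

To make the memory flows described under Method Overview more concrete, here is a minimal Python sketch of one way the three memories could interact. It is not the authors' implementation. The class `ToyMemory`, the parameters `short_capacity` and `entropy_threshold`, the use of prediction entropy as the uncertainty measure, running class means as the "compression", and nearest-neighbour relabeling as the "calibration" are all assumptions chosen for illustration.

```python
import math
from collections import deque


def entropy(probs):
    """Shannon entropy of a probability vector (higher means more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ToyMemory:
    """Hypothetical toy model of the three memories; not the paper's code."""

    def __init__(self, short_capacity=4, entropy_threshold=0.5):
        self.sensory = []          # current batch: [feature, pseudo_label, probs]
        self.short_term = deque()  # hard samples: [feature, pseudo_label]
        self.long_term = {}        # pseudo_label -> (mean feature, count)
        self.short_capacity = short_capacity
        self.entropy_threshold = entropy_threshold

    def _accumulate(self, feature, label):
        # Class-wise compression (assumed): fold the feature into a running mean.
        mean, n = self.long_term.get(label, ([0.0] * len(feature), 0))
        mean = [(m * n + f) / (n + 1) for m, f in zip(mean, feature)]
        self.long_term[label] = (mean, n + 1)

    def forward(self, batch):
        """Forward flow. `batch` is a list of (feature, source_probs) pairs."""
        for feature, label, probs in self.sensory:  # previous batch leaves sensory memory
            if entropy(probs) > self.entropy_threshold:
                self.short_term.append([feature, label])  # uncertain: keep as hard sample
            else:
                self._accumulate(feature, label)          # confident: compress
        while len(self.short_term) > self.short_capacity:
            feature, label = self.short_term.popleft()    # oldest hard sample is evicted
            self._accumulate(feature, label)
        # The new batch enters sensory memory with argmax pseudo labels.
        self.sensory = [[f, max(range(len(p)), key=p.__getitem__), p] for f, p in batch]

    def backward(self):
        """Backward flow: long-term calibrates short-term, then both calibrate sensory."""
        prototypes = [(mean, c) for c, (mean, _) in self.long_term.items()]
        if not prototypes:
            return  # nothing to calibrate against yet
        for item in self.short_term:
            item[1] = min(prototypes, key=lambda fc: sq_dist(item[0], fc[0]))[1]
        references = prototypes + [(f, c) for f, c in self.short_term]
        for item in self.sensory:
            item[1] = min(references, key=lambda fc: sq_dist(item[0], fc[0]))[1]


mem = ToyMemory(short_capacity=1)
mem.forward([([0.0, 0.0], [0.95, 0.05]), ([1.0, 1.0], [0.05, 0.95])])  # clean batch
mem.forward([([0.1, 0.1], [0.4, 0.6])])  # noisy: feature near class 0, predicted 1
print(mem.sensory[0][1])  # 1 before calibration
mem.backward()
print(mem.sensory[0][1])  # 0 after calibration
```

In this toy run, the second batch holds a sample whose feature lies near class 0 but whose source prediction favours class 1. The backward pass relabels it using the class means accumulated from the first batch. A real system would operate on learned network features and would need a more careful calibration rule than nearest-neighbour matching.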