Historical Test-time Prompt Tuning for Vision Foundation Models

Jingyi Zhang, Jiaxing Huang, Xiaoqin Zhang, Ling Shao, Shijian Lu
2024-10-27
Abstract: Test-time prompt tuning, which learns prompts online with unlabelled test samples during the inference stage, has demonstrated great potential by learning effective prompts on-the-fly without requiring any task-specific annotations. However, its performance often degrades clearly along the tuning process when the prompts are continuously updated with the test data flow, and the degradation becomes more severe when the domain of test samples changes continuously. We propose HisTPT, a Historical Test-time Prompt Tuning technique that memorizes the useful knowledge of the learnt test samples and enables robust test-time prompt tuning with the memorized knowledge. HisTPT introduces three types of knowledge banks, namely, local knowledge bank, hard-sample knowledge bank, and global knowledge bank, each of which works with different mechanisms for effective knowledge memorization and test-time prompt optimization. In addition, HisTPT features an adaptive knowledge retrieval mechanism that regularizes the prediction of each test sample by adaptively retrieving the memorized knowledge. Extensive experiments show that HisTPT achieves superior prompt tuning performance consistently while handling different visual recognition tasks (e.g., image classification, semantic segmentation, and object detection) and test samples from continuously changing domains.
Computer Vision and Pattern Recognition, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper addresses the performance degradation that occurs when **Test-time Prompt Tuning (TPT)** updates prompts continuously. When prompts are updated along the test data stream, existing methods often degrade significantly, especially when the domain of the test samples keeps changing. The degradation stems mainly from the lack of annotations for test samples and from the knowledge forgetting caused by accumulated prediction errors.

#### Background and Challenges

1. **Limitations of Existing TPT Methods**:
   - Existing TPT methods usually start from an initial template prompt (for example, “a photo of a [class]”) and optimize the prompt with a self-supervised objective computed on each test image and its model prediction.
   - As the prompt is continuously updated, these methods gradually forget the useful knowledge learned from earlier test samples, and performance degrades.
   - The degradation is particularly pronounced when the domain of the test samples changes.
2. **Knowledge Forgetting Problem**:
   - Existing methods can learn effective prompts in the early stage, but as tuning continues the learned prompts gradually deteriorate and may even become worse than the initial template prompt.
   - The cause is that, in the absence of labels, accumulated prediction errors lead to knowledge forgetting.

#### Proposed Solution

To address these problems, the authors propose **Historical Test-time Prompt Tuning (HisTPT)**, which alleviates knowledge forgetting by introducing three types of knowledge banks that memorize previously learned useful knowledge:

1. **Local Knowledge Bank**:
   - Stores the features of the most recent test samples to capture the latest distribution changes.
2. **Hard-sample Knowledge Bank**:
   - Identifies and stores hard-sample features from the local knowledge bank to capture difficult and rare cases.
3. **Global Knowledge Bank**:
   - Accumulates the features from the local knowledge bank and the hard-sample knowledge bank, and stores global, representative information.

In addition, HisTPT introduces an **Adaptive Knowledge Retrieval Mechanism**, which adaptively retrieves the memorized knowledge for each test sample and uses it for prediction regularization and prompt optimization.

#### Main Contributions

1. **Designed the HisTPT Framework**: Explores, for the first time, memory-based learning for test-time prompt tuning.
2. **Constructed Three Complementary Knowledge Banks**: The banks store local, hard-sample, and global information respectively, and an adaptive knowledge retrieval mechanism draws on them to alleviate knowledge forgetting.
3. **Experimentally Demonstrated Superior Performance**: Extensive experiments on multiple benchmark datasets show that HisTPT performs strongly across different visual recognition tasks (such as image classification, semantic segmentation, and object detection), especially when the domain of the test samples keeps changing.

With these components, HisTPT maintains higher performance and robustness throughout the test-time prompt tuning process.
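
To make the interplay of the three knowledge banks and the retrieval step more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation. Everything specific in it is an assumption chosen for illustration: the class name `KnowledgeBanks`, the FIFO queue for the local bank, prediction entropy as the measure of a "hard" sample, a running mean for the global bank, and cosine-similarity softmax weights for retrieval. The summary above only states what each bank stores, not how.

```python
from collections import deque
import math


def entropy(probs):
    """Shannon entropy of a probability vector (higher means less confident)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mean_vec(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class KnowledgeBanks:
    """Hypothetical container for local, hard-sample, and global banks."""

    def __init__(self, local_size=4, hard_k=2):
        # Assumption: the local bank is a fixed-size FIFO queue of
        # (feature, entropy) pairs; hard_k must be at least 1.
        self.local = deque(maxlen=local_size)
        self.hard_k = hard_k
        self.hard = []           # features of the most uncertain local entries
        self.global_feat = None  # running mean over all update steps
        self.steps = 0

    def update(self, feature, probs):
        """Memorize one test sample's feature and its prediction."""
        self.local.append((list(feature), entropy(probs)))
        # Assumption: "hard" samples are the highest-entropy local entries.
        ranked = sorted(self.local, key=lambda e: e[1], reverse=True)
        self.hard = [f for f, _ in ranked[: self.hard_k]]
        # Assumption: the global bank is a running mean of a summary built
        # from the local and hard-sample banks.
        summary = mean_vec([f for f, _ in self.local] + self.hard)
        self.steps += 1
        if self.global_feat is None:
            self.global_feat = summary
        else:
            self.global_feat = [
                g + (s - g) / self.steps
                for g, s in zip(self.global_feat, summary)
            ]

    def retrieve(self, query):
        """Blend the three banks, weighted by similarity to the query."""
        if self.global_feat is None:
            return list(query)  # nothing memorized yet
        candidates = [
            mean_vec([f for f, _ in self.local]),
            mean_vec(self.hard),
            self.global_feat,
        ]
        scores = [math.exp(cosine(query, c)) for c in candidates]
        total = sum(scores)
        return [
            sum(s / total * c[i] for s, c in zip(scores, candidates))
            for i in range(len(query))
        ]


banks = KnowledgeBanks(local_size=2, hard_k=1)
samples = [
    ([1.0, 0.0], [0.9, 0.1]),
    ([0.0, 1.0], [0.5, 0.5]),
    ([1.0, 1.0], [0.8, 0.2]),
]
for feat, probs in samples:
    banks.update(feat, probs)
print(banks.hard)  # [[0.0, 1.0]]: the most uncertain feature still in the local bank
```

In this sketch the retrieved vector is a convex combination of the three bank summaries, so a query that resembles recent samples leans on the local bank, while an unfamiliar query falls back more evenly on all three. How such a retrieved feature would then regularize the prediction and drive the prompt update is left out, since the summary does not specify it.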