Anytime Continual Learning for Open Vocabulary Classification

Zhen Zhu, Yiming Gong, Derek Hoiem
2024-09-13
Abstract: We propose an approach for anytime continual learning (AnytimeCL) for open vocabulary image classification. The AnytimeCL problem aims to break away from batch training and rigid models by requiring that a system can predict any set of labels at any time and efficiently update and improve when receiving one or more training samples at any time. Despite the challenging goal, we achieve substantial improvements over recent methods. We propose a dynamic weighting between predictions of a partially fine-tuned model and a fixed open vocabulary model that enables continual improvement when training samples are available for a subset of a task's labels. We also propose an attention-weighted PCA compression of training features that reduces storage and computation with little impact to model accuracy. Our methods are validated with experiments that test flexibility of learning and inference. Code is available at https://github.com/jessemelpolio/AnytimeCL.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper addresses continual learning for open-vocabulary image classification: how to update and improve a model efficiently as new labeled data arrives, while keeping the ability to predict over any label set. Specifically, the paper focuses on:

1. **Breaking the limitations of batch training and fixed models**: The system must be able to predict over any label set at any point in time, and to update and improve efficiently when it receives one or more training samples.
2. **Improving the performance of open-vocabulary classification**: Although existing open-vocabulary models (such as CLIP) are trained on large-scale Internet data, their performance on many tasks is still unsatisfactory. The paper therefore aims to keep improving these models through continual learning.
3. **Achieving "anytime" continual learning (AnytimeCL)**: The system should update quickly whenever new samples arrive and keep its ability to predict over any label set throughout the process.

To achieve these goals, the authors propose the following methods:

- **Dynamic weighted prediction**: Combine the predictions of the partially fine-tuned model and the fixed open-vocabulary model with a dynamic weighting, so that the system keeps improving even when training samples cover only a subset of a task's labels.
- **Attention-weighted PCA compression**: Compress the stored training features to reduce storage and computation with little impact on model accuracy.
- **Partial fine-tuning**: Fine-tune only the last transformer block of the model and keep the label embeddings unchanged, so that general features are retained while task-specific performance improves.
- **Loss function modification**: Add a loss term that lets the model predict "none of the above" when the true label is not in the candidate label set, which improves overall performance.
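As a rough illustration of the dynamic weighted prediction idea, the sketch below blends two probability vectors over the same candidate labels using per-label weights. The weighting rule (relative estimated accuracy of each model on each label) and every name in the code (`combine_predictions`, `acc_tuned`, `acc_fixed`) are assumptions made for this example; the summary does not give the paper's actual formula.

```python
import numpy as np

def combine_predictions(p_tuned, p_fixed, acc_tuned, acc_fixed, eps=1e-8):
    """Blend per-label probabilities from a tuned and a fixed model.

    Assumption (not from the paper): the tuned model's weight for a label
    is its estimated accuracy on that label relative to the fixed model's.
    A label the tuned model has never seen gets acc_tuned = 0, so the
    prediction for that label falls back to the fixed model.
    """
    p_tuned = np.asarray(p_tuned, dtype=float)
    p_fixed = np.asarray(p_fixed, dtype=float)
    acc_tuned = np.asarray(acc_tuned, dtype=float)
    acc_fixed = np.asarray(acc_fixed, dtype=float)

    # Per-label weight in [0, 1) given to the tuned model.
    w = acc_tuned / (acc_tuned + acc_fixed + eps)

    # Convex combination per label, then renormalize to sum to 1.
    blended = w * p_tuned + (1.0 - w) * p_fixed
    return blended / blended.sum()

# Three candidate labels; the tuned model has training samples
# for the first two labels only (hypothetical numbers).
p_tuned = [0.7, 0.3, 0.0]
p_fixed = [0.2, 0.3, 0.5]
acc_tuned = [0.9, 0.9, 0.0]
acc_fixed = [0.6, 0.6, 0.6]

probs = combine_predictions(p_tuned, p_fixed, acc_tuned, acc_fixed)
print(probs)
```

With a rule of this kind, labels without training samples are still predicted by the fixed open-vocabulary model, while labels with samples shift toward the tuned model as its estimated accuracy grows.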
The paper validates the flexibility and effectiveness of these methods in experiments covering data-incremental, class-incremental, and task-incremental learning scenarios, and reports substantial improvements over recent methods.
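To make the attention-weighted PCA compression mentioned among the methods more concrete, here is a minimal sketch of weighted PCA in NumPy. It assumes each stored feature vector comes with a non-negative attention weight and that the weights enter through the mean and covariance estimates; the summary does not say how the paper applies them, so the function names, shapes, and weighting scheme are all hypothetical.

```python
import numpy as np

def attention_weighted_pca(features, attn, k):
    """Compress (n, d) features to (n, k) codes with a weighted PCA.

    Assumption (not from the paper): attn holds one non-negative weight
    per feature vector, and higher-weight vectors contribute more to the
    mean and covariance that define the principal components.
    """
    features = np.asarray(features, dtype=float)
    attn = np.asarray(attn, dtype=float)
    attn = attn / attn.sum()                        # normalize weights

    mean = attn @ features                          # weighted mean, (d,)
    centered = features - mean
    cov = (centered * attn[:, None]).T @ centered   # weighted covariance, (d, d)

    # eigh returns eigenvalues in ascending order; keep the top-k vectors.
    _, eigvecs = np.linalg.eigh(cov)
    components = eigvecs[:, ::-1][:, :k].T          # (k, d)

    codes = centered @ components.T                 # compressed features, (n, k)
    return mean, components, codes

def reconstruct(mean, components, codes):
    """Map compressed codes back to the original feature space."""
    return codes @ components + mean

rng = np.random.default_rng(0)
features = rng.normal(size=(50, 16))   # stand-in for stored patch features
attn = rng.random(50)                  # stand-in for attention weights

mean, components, codes = attention_weighted_pca(features, attn, k=4)
approx = reconstruct(mean, components, codes)
print(codes.shape, approx.shape)
```

Storing `codes` plus the shared `mean` and `components` in place of the full features is what reduces storage here; the choice of `k` trades compression against reconstruction error.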