Few-Shot Class-Incremental Learning with Prior Knowledge

Wenhao Jiang, Duo Li, Menghan Hu, Guangtao Zhai, Xiaokang Yang, Xiao-Ping Zhang
2024-02-02
Abstract: To tackle the issues of catastrophic forgetting and overfitting in few-shot class-incremental learning (FSCIL), previous work has primarily concentrated on preserving the memory of old knowledge during the incremental phase. The role of the pre-trained model in shaping the effectiveness of incremental learning is frequently underestimated in these studies. Therefore, to enhance the generalization ability of the pre-trained model, we propose Learning with Prior Knowledge (LwPK) by introducing nearly free prior knowledge from a few unlabeled data of subsequent incremental classes. We cluster unlabeled incremental class samples to produce pseudo-labels, then jointly train these with labeled base class samples, effectively allocating embedding space for both old and new class data. Experimental results indicate that LwPK effectively enhances the model's resilience against catastrophic forgetting, with theoretical analysis based on empirical risk minimization and class distance measurement corroborating its operational principles. The source code of LwPK is publicly available at: \url{
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses catastrophic forgetting and overfitting in few-shot class-incremental learning (FSCIL). When a model is trained on only a small amount of data from new classes, it tends to forget previously learned knowledge, and because the data is so limited, it may also overfit the new samples. Both problems are common in practice, especially when data arrives as a stream, old data may not be reusable, and labels for new data are difficult to obtain.

To solve these problems, the authors propose a method named "Learning with Prior Knowledge (LwPK)". It enhances the generalization ability of the pre-trained model by introducing a small amount of unlabeled data from subsequent incremental classes as prior knowledge. The approach has three components:

1. **Pseudo-label Generation**: A clustering algorithm generates pseudo-labels for the unlabeled incremental class samples.
2. **Joint Training**: The pseudo-labeled incremental class samples are trained together with the labeled base class samples, which allocates embedding space for both old and new classes.
3. **Theoretical Analysis**: The effectiveness of LwPK is supported theoretically through empirical risk minimization and inter-class distance measurement.

The experimental results show that LwPK improves the model's performance on new classes while reducing catastrophic forgetting. The authors ran experiments on several benchmark datasets, including CIFAR100, CUB200, and miniImageNet, to verify the effectiveness of LwPK.
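The pseudo-label generation and joint-training steps described above can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not name the clustering algorithm, so plain k-means is assumed here, and the function names (`kmeans_pseudo_labels`, `build_joint_training_set`), the feature arrays, and the choice to offset pseudo-labels by the number of base classes are all hypothetical. The sketch only shows how clustered, unlabeled samples might be merged with labeled base-class samples into a single training set.

```python
import numpy as np


def kmeans_pseudo_labels(features, num_clusters, num_iters=50, seed=0):
    """Plain k-means (Lloyd's algorithm); returns one cluster index per sample.

    Assumption: k-means stands in for whatever clustering LwPK actually uses.
    """
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids from randomly chosen, distinct samples.
    init_idx = rng.choice(len(features), size=num_clusters, replace=False)
    centroids = features[init_idx].copy()
    for _ in range(num_iters):
        # Squared Euclidean distance from every sample to every centroid.
        diffs = features[:, None, :] - centroids[None, :, :]
        dists = (diffs ** 2).sum(axis=2)
        assignments = dists.argmin(axis=1)
        for k in range(num_clusters):
            members = features[assignments == k]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[k] = members.mean(axis=0)
    return assignments


def build_joint_training_set(base_x, base_y, unlabeled_x,
                             num_base_classes, num_new_classes):
    """Merge labeled base samples with pseudo-labeled incremental samples.

    Hypothetical helper: pseudo-labels are shifted past the base-class ids
    so that old and new classes occupy disjoint label ranges.
    """
    cluster_ids = kmeans_pseudo_labels(unlabeled_x, num_new_classes)
    pseudo_y = cluster_ids + num_base_classes
    joint_x = np.concatenate([np.asarray(base_x, dtype=float),
                              np.asarray(unlabeled_x, dtype=float)], axis=0)
    joint_y = np.concatenate([np.asarray(base_y), pseudo_y], axis=0)
    return joint_x, joint_y
```

The returned `joint_x` and `joint_y` could then be passed to any ordinary supervised training loop. Note that the cluster indices are arbitrary, so the pseudo-labels identify groups of similar samples rather than true class identities.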