Abstract: Few-shot class-incremental learning poses the challenge of learning new concepts when the learner can access only a few samples per concept. Standard incremental learning techniques cannot be applied directly because so few training samples are available. Moreover, catastrophic forgetting, the propensity of an artificial neural network to fully and abruptly forget previously learned knowledge upon learning new knowledge, remains a problem. It arises from a lack of supervision for older classes or an imbalance between the old and new classes. In this work, we propose a new distillation structure to tackle the forgetting and overfitting issues. In particular, we suggest a dual distillation module that adaptively draws knowledge from two different but complementary teachers. The first teacher is the base model, which has been trained on the large set of base-class data, and the second teacher is the updated model from the previous session (K-1), which contains the modified knowledge of previously observed new classes. Thus, the first teacher can reduce overfitting by transferring the knowledge obtained from the base classes to the new classes, while the second teacher can reduce knowledge forgetting by distilling knowledge from the previous model. Additionally, we use semantic information in the form of word embeddings to facilitate the distillation process. To align visual and semantic vectors, we apply an attention mechanism to the embedding of the visual data. In extensive experiments on datasets such as Mini-ImageNet, CIFAR100, and CUB200, our model shows state-of-the-art performance compared to existing few-shot incremental learning methods.
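The dual-teacher idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed mixing weight `alpha`, and the temperature value are all hypothetical, and the abstract does not say how the two teachers are weighted "adaptively". The sketch also assumes that all three logit vectors have already been restricted to the same set of classes, and it omits the semantic word-embedding and attention components.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def dual_distillation_loss(student_logits, base_teacher_logits,
                           prev_teacher_logits, alpha=0.5, temperature=2.0):
    """Weighted sum of two distillation terms (hypothetical formulation).

    base_teacher_logits: outputs of the base model (first teacher).
    prev_teacher_logits: outputs of the model from session K-1 (second teacher).
    alpha: assumed fixed weight on the base teacher; the paper's adaptive
           weighting is not specified in the abstract.
    """
    student = softmax(student_logits, temperature)
    base = softmax(base_teacher_logits, temperature)
    prev = softmax(prev_teacher_logits, temperature)
    loss_base = kl_divergence(base, student)  # term aimed at overfitting
    loss_prev = kl_divergence(prev, student)  # term aimed at forgetting
    return alpha * loss_base + (1.0 - alpha) * loss_prev


if __name__ == "__main__":
    loss = dual_distillation_loss(
        student_logits=[2.0, 0.5, -1.0],
        base_teacher_logits=[1.5, 0.7, -0.8],
        prev_teacher_logits=[2.2, 0.1, -1.3],
    )
    print(f"dual distillation loss: {loss:.6f}")
```

In a full training setup this term would typically be added to a classification loss on the new few-shot samples; how the two are balanced is a further assumption not covered by the abstract.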

Self-supervised Knowledge Distillation for Few-shot Learning

Class similarity weighted knowledge distillation for few shot incremental learning

SSL-ProtoNet: Self-supervised Learning Prototypical Networks for few-shot learning

Enhanced ProtoNet With Self-Knowledge Distillation for Few-Shot Learning

Self-supervised Multi-task Distillation for Few-shot Classification

Progressive Network Grafting for Few-Shot Knowledge Distillation

Self-Training Based Few-Shot Node Classification by Knowledge Distillation

Contrastive Knowledge-Augmented Self-Distillation Approach for Few-Shot Learning

Few-shot Class Incremental Learning Via Prompt Transfer and Knowledge Distillation

Integrating Knowledge Distillation with Learning to Rank for Few-Shot Scene Classification

SSL-DC: Improving Transductive Few-Shot Learning Via Self-Supervised Learning and Distribution Calibration

Knowledge-Based Fine-Grained Classification for Few-Shot Learning

Hierarchical Knowledge Propagation and Distillation for Few-Shot Learning

Knowledge Distillation Using Hierarchical Self-Supervision Augmented Distribution

Self-supervised Network Evolution for Few-shot Classification

Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning

Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning

Few-Shot Object Detection Based on Self-Knowledge Distillation

Supervised Masked Knowledge Distillation for Few-Shot Transformers

Knowledge Distillation with Deep Supervision

Enhancing Few-Shot Learning in Lightweight Models via Dual-Faceted Knowledge Distillation