Abstract: Human intelligence gradually accepts new information and accumulates knowledge throughout the lifespan. However, deep learning models suffer from a catastrophic forgetting phenomenon, where they forget previous knowledge when acquiring new information. Class-Incremental Learning aims to create an integrated model that balances plasticity and stability to overcome this challenge. In this paper, we propose a selective regularization method that accepts new knowledge while maintaining previous knowledge. We first introduce an asymmetric feature distillation method for old and new classes inspired by cognitive science, using the gradient of classification and knowledge distillation losses to determine whether to perform pattern completion or pattern separation. We also propose a method to selectively interpolate the weight of the previous model for a balance between stability and plasticity, and we adjust whether to transfer through model confidence to ensure the performance of the previous class and enable exploratory learning. We validate the effectiveness of the proposed method, which surpasses the performance of existing methods through extensive experimental protocols using CIFAR-100, ImageNet-Subset, and ImageNet-Full.

What problem does this paper attempt to address?

This paper addresses catastrophic forgetting in class-incremental learning (CIL): how to retain old knowledge while learning new knowledge. Specifically, it proposes a selective regularization method for class-incremental learning (SRIL), which aims to balance the stability and plasticity of the model through gradient-based feature distillation (GFD) and confidence-aware weight interpolation (CWI).

### Main Contributions

1. **Gradient-based Feature Distillation (GFD)**
   - The cosine similarity between the gradients of the classification loss and the knowledge distillation loss is computed, and a binary mask derived from it decides whether feature distillation is applied to each channel.
   - For new-class data, gradients of the classification loss and the knowledge distillation loss that point in the same direction indicate pattern completion; otherwise they indicate pattern separation.
2. **Confidence-aware Weight Interpolation (CWI)**
   - Interpolating the weights of the old model into the new model keeps the loss on old data from changing, which preserves the stability of the model.
   - The interpolation parameter is adjusted dynamically according to the confidence of the new model on old data. If that confidence exceeds a certain threshold, the regularization is removed to allow exploratory learning, balancing stability and plasticity.

### Experimental Verification

- **Datasets**: Experiments use three datasets: CIFAR-100, ImageNet-Subset, and ImageNet-Full.
- **Performance Comparison**: Compared with existing state-of-the-art methods (such as iCaRL, BiC, LUCIR, and PODNet), SRIL achieves better performance in multiple task settings.
- **Ablation Study**: Ablation experiments analyze the effect of each component (GFD and CWI) and verify that both contribute.

### Conclusion

The proposed SRIL method addresses catastrophic forgetting in class-incremental learning and shows superior performance on multiple datasets. Through gradient-based feature distillation and confidence-aware weight interpolation, SRIL retains old knowledge while learning new knowledge, achieving a good balance between the stability and plasticity of the model.
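The two components summarized above can be illustrated with a small, dependency-free sketch. This is not the authors' implementation. The function names (`gfd_channel_mask`, `masked_feature_distillation`, `cwi_interpolate`), the zero cutoff on the cosine similarity, the squared-distance distillation term, and the default `threshold` and `alpha` values are all assumptions made for illustration. In particular, the summary does not state which gradient direction keeps distillation switched on; the sketch assumes distillation is kept on channels whose two gradients agree.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def gfd_channel_mask(grad_cls, grad_kd):
    """Binary mask per channel from the two per-channel gradient vectors.

    Assumption: 1 (distill) when the classification and distillation
    gradients point the same way (cosine > 0), else 0.
    """
    return [
        1 if cosine_similarity(gc, gk) > 0.0 else 0
        for gc, gk in zip(grad_cls, grad_kd)
    ]


def masked_feature_distillation(old_feat, new_feat, mask):
    """Squared distance between old and new features, summed over masked channels only."""
    total = 0.0
    for f_old, f_new, m in zip(old_feat, new_feat, mask):
        if m:
            total += sum((a - b) ** 2 for a, b in zip(f_old, f_new))
    return total


def cwi_interpolate(old_w, new_w, confidence, threshold=0.9, alpha=0.5):
    """Blend old weights into new ones unless confidence on old data is high.

    Assumption: a hard switch. Above `threshold` the new weights are returned
    unchanged (no regularization); otherwise a fixed blend with weight `alpha`
    on the old model is used.
    """
    if confidence > threshold:
        return list(new_w)
    return [alpha * o + (1.0 - alpha) * n for o, n in zip(old_w, new_w)]


# Toy example: three channels, two gradient entries each.
grad_cls = [[1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
grad_kd = [[2.0, 0.0], [-1.0, -1.0], [1.0, 1.0]]
mask = gfd_channel_mask(grad_cls, grad_kd)

old_feat = [[1.0, 2.0], [0.0, 0.0], [0.0, 0.0]]
new_feat = [[0.0, 0.0], [5.0, 5.0], [5.0, 5.0]]
loss = masked_feature_distillation(old_feat, new_feat, mask)

blended = cwi_interpolate([0.0, 2.0], [1.0, 4.0], confidence=0.5)
unblended = cwi_interpolate([0.0, 2.0], [1.0, 4.0], confidence=0.95)
```

In the toy example only the first channel has aligned gradients, so only that channel contributes to the distillation term. In a real training loop the gradients would come from an autograd framework and the confidence from the new model's predictions on stored old-class samples; both are outside the scope of this sketch.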

SRIL: Selective Regularization for Class-Incremental Learning

Fine-Grained Knowledge Selection and Restoration for Non-Exemplar Class Incremental Learning

Curiosity-Driven Class-Incremental Learning Via Adaptive Sample Selection

A Robust and Anti-Forgettiable Model for Class-Incremental Learning

Multi-granularity knowledge distillation and prototype consistency regularization for class-incremental learning

Model Behavior Preserving for Class-Incremental Learning

Class incremental learning with probability dampening and cascaded gated classifier

Adaptive knowledge transfer for class incremental learning

Class Incremental Learning with Deep Contrastive Learning and Attention Distillation

Maintaining Discrimination and Fairness in Class Incremental Learning

Large Scale Incremental Learning

Class Incremental Learning Via Multi-hinge Distillation

Class Incremental Learning with Less Forgetting Direction and Equilibrium Point

Memory-Free Generative Replay For Class-Incremental Learning

Rebalancing network with knowledge stability for class incremental learning

Brain-Inspired Continual Learning: Robust Feature Distillation and Re-Consolidation for Class Incremental Learning

Discrimination Correction and Balance for Class-Incremental Learning

Rethinking Class-Incremental Learning from a Dynamic Imbalanced Learning Perspective

Inherit With Distillation and Evolve With Contrast: Exploring Class Incremental Semantic Segmentation Without Exemplar Memory

Adaptively Integrated Knowledge Distillation and Prediction Uncertainty for Continual Learning