Abstract: Continual learning (CL) studies how to learn new knowledge continuously from data streams while retaining previously acquired knowledge. A key obstacle is catastrophic forgetting: the model's performance on previous tasks declines significantly after it learns a subsequent task. Several studies address this by replaying samples stored in a buffer while training on new tasks. However, the data imbalance between old and new task samples causes two serious problems: information suppression and weak feature discriminability. The former means that information from the abundant new-task samples suppresses that from the old-task samples, which harms knowledge retention because the biased output reduces the consistency of the same sample's output at different moments. The latter means that the feature representation is biased toward the new task and lacks the discriminability needed to distinguish old and new tasks. To this end, we build an imbalance mitigation for CL (IMCL) framework that incorporates a decoupled knowledge distillation (DKD) approach and a dual enhanced contrastive learning (DECL) approach to tackle both problems. Specifically, the DKD approach alleviates the new task's suppression of the old tasks by decoupling the model output probability during the replay stage, which better maintains the knowledge of old tasks. The DECL approach enhances both low- and high-level features and fuses the enhanced features to construct a contrastive loss that effectively distinguishes different tasks. Extensive experiments on three popular datasets show that our method achieves promising performance under task incremental learning (Task-IL), class incremental learning (Class-IL), and domain incremental learning (Domain-IL) settings.
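
To make the information-suppression problem concrete, the following is a minimal sketch, not the authors' DKD formulation (the abstract does not specify it). It assumes one simple form of decoupling: normalizing the old-class and new-class logits separately, so that large new-class logits cannot shrink the old-class probabilities. The logit values, the class split, and the helper names `softmax`, `decoupled_softmax`, and `kl_divergence` are all hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decoupled_softmax(logits, num_old):
    """Normalize old-class and new-class logits separately.

    Assumption for illustration only: the first `num_old` entries
    belong to old tasks and the rest to the new task.
    """
    return softmax(logits[:num_old]), softmax(logits[num_old:])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


# Hypothetical logits for one replayed old-task sample
# (3 old classes followed by 2 new classes).
stored_logits = [2.0, 0.5, -1.0, 0.0, 0.0]   # recorded before the new task
current_logits = [2.0, 0.5, -1.0, 4.0, 3.0]  # after training on the new task

# Joint softmax: new-class logits absorb most of the probability mass,
# even though the old-class logits themselves did not change.
joint_old_mass = sum(softmax(current_logits)[:3])
joint_kl = kl_divergence(softmax(stored_logits), softmax(current_logits))

# Decoupled view: the old-class distribution is compared on its own.
old_then, _ = decoupled_softmax(stored_logits, 3)
old_now, _ = decoupled_softmax(current_logits, 3)
decoupled_kl = kl_divergence(old_then, old_now)

print(f"old-class mass under joint softmax: {joint_old_mass:.3f}")
print(f"joint KL: {joint_kl:.3f}, decoupled old-class KL: {decoupled_kl:.3f}")
```

In this toy case the old-class logits are unchanged, so the decoupled distillation term is zero, while the joint comparison reports a large discrepancy caused entirely by the new-class logits. That gap is the kind of suppression the abstract attributes to data imbalance during replay.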

On the Convergence of Continual Learning with Adaptive Methods

Progressive Learning without Forgetting

Adaptive Progressive Continual Learning

Mitigating Catastrophic Forgetting in Task-Incremental Continual Learning with Adaptive Classification Criterion

Self-paced Weight Consolidation for Continual Learning

Adaptive Plasticity Improvement for Continual Learning

Task Agnostic Continual Learning via Meta Learning

Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning

Elastic Multi-Gradient Descent for Parallel Continual Learning

Learning to Continually Learn with the Bayesian Principle

Optimizing Spca-based Continual Learning: A Theoretical Approach

AdaptCL: Adaptive Continual Learning for Tackling Heterogeneity in Sequential Datasets

Similarity-Based Adaptation for Task-Aware and Task-Free Continual Learning

Maintaining Plasticity in Deep Continual Learning

Balancing Stability and Plasticity Through Advanced Null Space in Continual Learning

Layerwise Optimization by Gradient Decomposition for Continual Learning

An Effective Dynamic Gradient Calibration Method for Continual Learning

Efficient Meta-Learning for Continual Learning with Taylor Expansion Approximation

Challenging Common Assumptions about Catastrophic Forgetting

Imbalance Mitigation for Continual Learning via Knowledge Decoupling and Dual Enhanced Contrastive Learning

Recall-Oriented Continual Learning with Generative Adversarial Meta-Model