Abstract: Continual learning, or the ability to progressively integrate new concepts, is fundamental to intelligent beings, enabling adaptability in dynamic environments. In contrast, artificial deep neural networks face the challenge of catastrophic forgetting when learning new tasks sequentially. To alleviate the problem of forgetting, recent approaches aim to preserve essential weight subspaces for previous tasks by limiting updates to orthogonal subspaces via gradient projection. While effective, this approach can lead to suboptimal performance, particularly when tasks are highly correlated. In this work, we introduce COnceptor-based gradient projection for DEep Continual Learning (CODE-CL), a novel method that leverages conceptor matrix representations, a computational model inspired by neuroscience, to more flexibly handle highly correlated tasks. CODE-CL encodes directional importance within the input space of past tasks, allowing new knowledge integration in directions modulated by $1-S$, where $S$ represents the direction's relevance for prior tasks. Additionally, we analyze task overlap using conceptor-based representations to identify highly correlated tasks, facilitating efficient forward knowledge transfer through scaled projection within their intersecting subspace. This strategy enhances flexibility, allowing learning in correlated tasks without significantly disrupting previous knowledge. Extensive experiments on continual learning image classification benchmarks validate CODE-CL's efficacy, showcasing superior performance with minimal forgetting, outperforming most state-of-the-art methods.
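The $1-S$ modulation mentioned in the abstract can be made concrete with the standard conceptor definition from the conceptor literature. The block below is a sketch under assumptions, not the paper's exact formulation: the aperture symbol $\alpha$, the input matrix $X$ with $n$ columns, and the right-multiplied update rule are notation introduced here for illustration.

$$
R = \frac{1}{n} X X^{\top}, \qquad C = R \left( R + \alpha^{-2} I \right)^{-1}
$$

$$
R = U \Sigma U^{\top} \;\Longrightarrow\; C = U S U^{\top}, \qquad s_i = \frac{\sigma_i}{\sigma_i + \alpha^{-2}} \in [0, 1)
$$

$$
\nabla W \;\leftarrow\; \nabla W \, (I - C) = \nabla W \, U (I - S) U^{\top}
$$

Under these assumptions, the gradient component along direction $u_i$ is scaled by $1 - s_i$: directions that mattered for earlier tasks ($s_i$ close to 1) are strongly damped, while unused directions ($s_i$ close to 0) remain almost free for new learning.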


### What problem does this paper attempt to solve?

This paper addresses catastrophic forgetting in deep continual learning (CL). When a deep neural network learns tasks sequentially, it tends to lose the knowledge acquired on earlier tasks, which severely limits the model's adaptability.

#### Background of the Catastrophic Forgetting Problem

- **Characteristics of Intelligent Agents**: Continual learning is a defining characteristic of intelligent agents, enabling them to keep integrating new concepts in a dynamic environment.
- **Challenges of Artificial Neural Networks**: Unlike biological agents, artificial deep neural networks are prone to catastrophic forgetting when learning tasks sequentially: performance on earlier tasks degrades significantly as new tasks are learned.

#### Limitations of Current Methods

To alleviate catastrophic forgetting, existing methods fall mainly into three categories:

1. **Regularization Methods**: Protect the key features of previous tasks by constraining updates to important parameters.
2. **Expansion Methods**: Dynamically allocate new network resources to accommodate the increasing complexity of sequential tasks.
3. **Memory Methods**: Store representative samples or features from previous tasks to maintain performance on the earlier data distribution.

Although these methods are effective to a degree, they usually trade off flexibility against retention, and many rely on a large amount of memory storage or allocate dedicated resources for each new task.

#### The New Method Proposed in the Paper

The authors propose COnceptor-based gradient projection for DEep Continual Learning (CODE-CL), a new method built on conceptor matrix representations. Its main contributions are:

1. **Introduction of the Conceptor Matrix**: Represent each task subspace with a conceptor matrix that encodes the directional importance of past tasks in the input space, which allows more flexible handling of highly correlated tasks.
2. **Task Overlap Analysis**: Identify highly correlated tasks by computing the intersection subspaces between tasks, promoting effective forward knowledge transfer.
3. **Gradient Projection Strategy**: For related tasks, perform scaled projection within the intersection subspace; for unrelated tasks, restrict the update to avoid interfering with previous knowledge.

#### Experimental Verification

The authors conducted extensive experiments on multiple continual learning image classification benchmarks (Permuted MNIST, Split CIFAR100, Split miniImageNet, and 5-Datasets). The results show that CODE-CL achieves superior performance while maintaining minimal forgetting, outperforming most existing state-of-the-art methods.

### Summary

By introducing CODE-CL, this paper provides a scalable and efficient continual learning method that can flexibly acquire new information while retaining past knowledge, mitigating catastrophic forgetting in deep neural networks during continual learning.
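The conceptor-based gradient projection described above can be illustrated with a small NumPy sketch. It is a minimal example under stated assumptions rather than the authors' implementation: the function names `conceptor` and `project_gradient`, the aperture value `alpha=10.0`, and the toy data are all hypothetical, and the task-overlap analysis and the scaled projection within the intersecting subspace are not shown.

```python
import numpy as np


def conceptor(X, alpha):
    """Conceptor C = R (R + alpha^-2 I)^-1 for inputs X of shape (features, samples)."""
    d, n = X.shape
    R = X @ X.T / n  # correlation matrix of the layer inputs
    return R @ np.linalg.inv(R + alpha ** -2 * np.eye(d))


def project_gradient(grad, C):
    """Right-multiply an (out, in) weight gradient by (I - C) in input space."""
    return grad @ (np.eye(C.shape[0]) - C)


# Toy past task (assumption): inputs only vary along the first two of four dimensions.
rng = np.random.default_rng(0)
X = np.zeros((4, 200))
X[:2] = rng.normal(size=(2, 200))

C = conceptor(X, alpha=10.0)
S = np.linalg.eigvalsh(C)      # per-direction relevance, each value in [0, 1)

grad = np.ones((3, 4))         # stand-in gradient for a new task
g = project_gradient(grad, C)  # damped along used directions, unchanged elsewhere
```

In this toy setup the gradient columns for the two directions used by the past task are scaled down to near zero, while the columns for the two unused directions pass through unchanged. Unlike a hard orthogonal projection, the scaling is soft, so partially relevant directions still receive a reduced update.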

CODE-CL: COnceptor-Based Gradient Projection for DEep Continual Learning
