Abstract: Catastrophic forgetting (CF) is a significant challenge in continual learning (CL). In regularization-based approaches to mitigating CF, modifications to important training parameters are penalized in subsequent tasks using an appropriate loss function. We propose RTRA, a modification to the widely used Elastic Weight Consolidation (EWC) regularization scheme that uses the Natural Gradient for loss function optimization. Our approach improves the training of regularization-based methods without sacrificing test-data performance. We compare the proposed RTRA approach against EWC using the iFood251 dataset. We show that RTRA has a clear edge over the state-of-the-art approaches.

What problem does this paper attempt to address?

The problem this paper addresses is catastrophic forgetting (CF) in continual learning (CL). Specifically, the authors propose a regularization-based method, RTRA (Rapid Training of Regularization-based Approaches), which aims to accelerate model training by using the natural gradient to optimize the loss function, without degrading test-data performance.

### Main problems:

1. **Catastrophic Forgetting**: In continual learning, a model that learns new tasks tends to forget what it learned on previous tasks, so its performance on those tasks degrades.
2. **Training Efficiency**: Traditional regularization methods such as EWC (Elastic Weight Consolidation) are effective but slow to train, especially when there are many tasks.

### Solutions:

- **RTRA Method**: RTRA uses the Natural Gradient (NG) to optimize the EWC loss function, which accelerates training.
- **iFood251 Dataset**: The authors evaluate RTRA on iFood251, a food classification dataset with 251 food categories that is suitable for evaluating Class-Incremental Learning (CIL).

### Specific Contributions:

1. **First Application of Natural Gradient**: According to the paper, this is the first time the natural gradient has been applied to regularization-based continual learning to improve training speed.
2. **New Benchmark**: The paper establishes a new benchmark on the iFood251 dataset, filling a research gap for this dataset in class-incremental learning.

### Experimental Results:

- RTRA reduces training time by 7.71% compared with EWC without sacrificing accuracy.
- Across different task sizes (for example, 25, 30, and 35 classes per task), RTRA shows better performance.

In conclusion, the main goal of this paper is to improve existing regularization methods by introducing the natural gradient, so that catastrophic forgetting in continual learning can be addressed more efficiently. The approach is validated experimentally on the iFood251 dataset.
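To make the two ingredients named in this summary concrete, the following is a minimal NumPy sketch of an EWC-style penalty combined with a natural-gradient-style (Fisher-preconditioned) update. It is not the authors' RTRA implementation. The toy quadratic task, all numeric values, the diagonal Fisher approximation, the damping term, and the choice to precondition with the curvature of the full objective are assumptions made purely for illustration.

```python
import numpy as np

# Toy setup (all values hypothetical): a 2-parameter model.
theta_star = np.zeros(2)            # parameters learned on the old task
fisher_old = np.array([0.5, 2.0])   # diagonal Fisher from the old task (importance weights)
lam = 1.0                           # EWC regularization strength

# New task: an ill-conditioned quadratic loss standing in for a real network loss.
curv_new = np.array([100.0, 1.0])   # per-parameter curvature of the new-task loss
target = np.array([1.0, 2.0])       # minimizer of the new-task loss alone


def task_loss(theta):
    return 0.5 * np.sum(curv_new * (theta - target) ** 2)


def task_grad(theta):
    return curv_new * (theta - target)


def ewc_loss(theta):
    # EWC: new-task loss plus a quadratic penalty that anchors parameters
    # the old task considers important (large Fisher value) near theta_star.
    penalty = 0.5 * lam * np.sum(fisher_old * (theta - theta_star) ** 2)
    return task_loss(theta) + penalty


def ewc_grad(theta):
    return task_grad(theta) + lam * fisher_old * (theta - theta_star)


# Assumption: precondition with the diagonal curvature of the full objective.
# For this quadratic toy it plays the role of a diagonal Fisher matrix.
precond = curv_new + lam * fisher_old


def plain_step(theta, grad, lr=0.01):
    # Ordinary gradient descent; lr is limited by the largest curvature.
    return theta - lr * grad


def natural_step(theta, grad, lr=1.0, damping=1e-3):
    # Natural-gradient-style update: rescale the gradient by the inverse
    # of the damped diagonal Fisher before stepping.
    return theta - lr * grad / (precond + damping)


def steps_to_converge(step_fn, tol=1e-6, max_steps=10_000):
    # Count updates until the gradient norm of the EWC objective drops below tol.
    theta = theta_star.copy()
    for step in range(max_steps):
        grad = ewc_grad(theta)
        if np.linalg.norm(grad) < tol:
            return step, theta
        theta = step_fn(theta, grad)
    return max_steps, theta


gd_steps, gd_theta = steps_to_converge(plain_step)
ng_steps, ng_theta = steps_to_converge(natural_step)

# Closed-form minimizer of the quadratic EWC objective, for reference.
theta_opt = (curv_new * target + lam * fisher_old * theta_star) / precond

print("plain gradient descent updates:", gd_steps)
print("preconditioned updates:", ng_steps)
```

In this toy problem the preconditioned update needs far fewer iterations because it removes the mismatch in curvature between the two parameters. This only illustrates why Fisher preconditioning can speed up optimization of a regularized objective. It says nothing about the 7.71% figure reported for RTRA, which comes from the paper's own experiments.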

RTRA: Rapid Training of Regularization-based Approaches in Continual Learning

Progressive Learning without Forgetting

Self-paced Weight Consolidation for Continual Learning

Continual Learning in Human Activity Recognition: an Empirical Analysis of Regularization

Slowing Down Forgetting in Continual Learning

Regularization Shortcomings for Continual Learning

IMEX-Reg: Implicit-Explicit Regularization in the Function Space for Continual Learning

Can Continual Learning Improve Long-Tailed Recognition? Toward a Unified Framework

Overcoming Catastrophic Forgetting in Continual Learning by Exploring Eigenvalues of Hessian Matrix.

On the Convergence of Continual Learning with Adaptive Methods

Mitigating Catastrophic Forgetting in Task-Incremental Continual Learning with Adaptive Classification Criterion

Continual Learning with Recursive Gradient Optimization

TRGP: Trust Region Gradient Projection for Continual Learning

Learning After Learning: Positive Backward Transfer in Continual Learning

Realistic Continual Learning Approach using Pre-trained Models

Addressing Loss of Plasticity and Catastrophic Forgetting in Continual Learning

Defeating Catastrophic Forgetting via Enhanced Orthogonal Weights Modification

EVCL: Elastic Variational Continual Learning with Weight Consolidation

Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning

The Ideal Continual Learner: An Agent That Never Forgets

Towards continuous learning for glioma segmentation with elastic weight consolidation