RTRA: Rapid Training of Regularization-based Approaches in Continual Learning

Sahil Nokhwal, Nirman Kumar
2023-12-15
Abstract: Catastrophic forgetting (CF) is a significant challenge in continual learning (CL). In regularization-based approaches to mitigate CF, modifications to important training parameters are penalized in subsequent tasks using an appropriate loss function. We propose RTRA, a modification to the widely used Elastic Weight Consolidation (EWC) regularization scheme, using the Natural Gradient for loss function optimization. Our approach improves the training of regularization-based methods without sacrificing test-data performance. We compare the proposed RTRA approach against EWC using the iFood251 dataset. We show that RTRA has a clear edge over the state-of-the-art approaches.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem this paper addresses is catastrophic forgetting (CF) in continual learning (CL). Specifically, the authors propose RTRA (Rapid Training of Regularization-based Approaches), a regularization-based method that uses the natural gradient to optimize the loss function, with the aim of speeding up training without degrading test-data performance.

### Main problems:
1. **Catastrophic Forgetting**: In continual learning, a model that learns new tasks tends to forget what it learned on previous tasks, so its performance on those earlier tasks degrades.
2. **Training Efficiency**: Traditional regularization methods such as EWC (Elastic Weight Consolidation) are effective but slow to train, especially when the number of tasks is large.

### Solutions:
- **RTRA Method**: RTRA uses the Natural Gradient (NG) to optimize the EWC loss function, which accelerates training.
- **iFood251 Dataset**: The authors run their experiments on iFood251, a food classification dataset with 251 food categories, to verify the effectiveness of RTRA. The dataset is suitable for evaluating Class-Incremental Learning (CIL).

### Specific Contributions:
1. **First Application of Natural Gradient**: The authors present this as the first application of the natural gradient to regularization-based continual learning for the purpose of improving training speed.
2. **New Benchmark**: The paper establishes a new benchmark on the iFood251 dataset, filling a gap in class-incremental learning research on this dataset.

### Experimental Results:
- RTRA reduces training time by 7.71% compared with EWC without sacrificing accuracy.
- Across different task sizes (for example, 25, 30, and 35 classes per task), RTRA shows better performance than EWC.

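The idea summarized above, an EWC-style penalty optimized with a natural-gradient step, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a diagonal Fisher information approximation and a damped inverse, and the function names (`ewc_penalty`, `ewc_penalty_grad`, `natural_gradient_step`) and the `lam`, `lr`, and `damping` parameters are hypothetical choices made for illustration.

```python
import numpy as np


def ewc_penalty(theta, theta_star, fisher_diag, lam):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    theta_star holds the parameters learned on the previous task and
    fisher_diag the (assumed diagonal) Fisher information at theta_star.
    """
    return 0.5 * lam * np.sum(fisher_diag * (theta - theta_star) ** 2)


def ewc_penalty_grad(theta, theta_star, fisher_diag, lam):
    """Gradient of the EWC penalty with respect to theta."""
    return lam * fisher_diag * (theta - theta_star)


def natural_gradient_step(theta, grad, fisher_diag, lr, damping=1e-3):
    """One natural-gradient update: theta <- theta - lr * F^{-1} grad.

    With a diagonal Fisher approximation (an assumption of this sketch),
    F^{-1} grad reduces to an element-wise division. The damping term
    keeps the division stable when a Fisher entry is close to zero.
    """
    return theta - lr * grad / (fisher_diag + damping)


# Toy usage: two parameters with very different Fisher values.
theta_star = np.array([1.0, -2.0])   # parameters after the previous task
fisher_diag = np.array([4.0, 0.25])  # assumed importance of each parameter
theta = np.array([1.5, -1.0])        # current parameters on the new task
lam = 2.0

penalty = ewc_penalty(theta, theta_star, fisher_diag, lam)
grad = ewc_penalty_grad(theta, theta_star, fisher_diag, lam)

# In practice grad would also include the new-task loss gradient;
# here only the penalty gradient is used to keep the example small.
theta_next = natural_gradient_step(theta, grad, fisher_diag, lr=1.0 / lam, damping=0.0)
```

In this toy case the Fisher preconditioning cancels the per-parameter curvature of the penalty, so a single step with `lr = 1 / lam` lands on `theta_star`. A plain gradient step with one shared learning rate would move the two coordinates at very different speeds. That rescaling is the general intuition behind using the natural gradient to speed up optimization.
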
In conclusion, the main goal of this paper is to improve existing regularization methods by introducing the natural gradient, so that catastrophic forgetting in continual learning can be mitigated more efficiently. The authors demonstrate the method's effectiveness in experiments on the iFood251 dataset.
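
The class-incremental setup referred to in the summary (251 classes presented in tasks of 25, 30, or 35 classes) can be illustrated with a small sketch. The helper name `split_classes_into_tasks` is hypothetical. The summary does not say how the paper orders classes or handles the classes left over when 251 is not divisible by the task size, so the sequential ordering and the smaller final task here are assumptions.

```python
def split_classes_into_tasks(num_classes, task_size):
    """Partition class ids 0..num_classes-1 into consecutive tasks.

    Every task holds task_size classes except possibly the last one,
    which holds the remainder (an assumption of this sketch).
    """
    class_ids = list(range(num_classes))
    return [class_ids[i:i + task_size] for i in range(0, num_classes, task_size)]


# Task sizes mentioned in the summary, applied to the 251 iFood251 classes.
for task_size in (25, 30, 35):
    tasks = split_classes_into_tasks(251, task_size)
    print(task_size, len(tasks), len(tasks[-1]))
```

In a class-incremental run, the model would be trained on each task's classes in turn and evaluated on all classes seen so far.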