Budgeted Online Continual Learning by Adaptive Layer Freezing and Frequency-based Sampling

Minhyuk Seo, Hyunseo Koh, Jonghyun Choi
2024-10-20
Abstract: The majority of online continual learning (CL) advocates single-epoch training and imposes restrictions on the size of replay memory. However, single-epoch training would incur a different amount of computations per CL algorithm, and the additional storage cost to store logit or model in addition to replay memory is largely ignored in calculating the storage budget. Arguing different computational and storage budgets hinder fair comparison among CL algorithms in practice, we propose to use floating point operations (FLOPs) and total memory size in Byte as a metric for computational and memory budgets, respectively, to compare and develop CL algorithms in the same 'total resource budget.' To improve a CL method in a limited total budget, we propose adaptive layer freezing that does not update the layers for less informative batches to reduce computational costs with a negligible loss of accuracy. In addition, we propose a memory retrieval method that allows the model to learn the same amount of knowledge as using random retrieval in fewer iterations. Empirical validations on the CIFAR-10/100, CLEAR-10/100, and ImageNet-1K datasets demonstrate that the proposed approach outperforms the state-of-the-art methods within the same total budget.
Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the lack of a fair, unified accounting of computational and storage resources in online continual learning (CL). Specifically:

1. **Differences in computational budgets**: Most online CL methods advocate single-epoch training, but the actual amount of computation in one epoch differs from algorithm to algorithm, which makes comparisons unfair.
2. **Neglect of storage costs**: Beyond the replay memory, some methods need additional storage for model parameters or logits (unnormalized model output vectors), and these costs are usually left out of the storage budget.
3. **Lack of a unified resource budget standard**: Current research has no common standard for measuring the computational and storage consumption of different CL algorithms, which makes fair comparison difficult.

To solve these problems, the paper makes the following contributions:

### Main contributions

1. **Unified resource budget standard**:
   - Use the number of floating-point operations (FLOPs) as the metric for the computational budget.
   - Use the total memory size (in bytes) as the metric for the storage budget.
   - Together, these ensure that all CL algorithms are compared and developed under the same "total resource budget".
2. **Adaptive layer freezing**:
   - Propose an adaptive layer-freezing strategy that selectively freezes layers that are less informative for the current mini-batch, reducing computational cost while maintaining high accuracy.
   - Specifically, determine the optimal number of frozen layers by maximizing the Fisher Information (FI) obtained from each training batch.
3. **Frequency-based sample retrieval**:
   - Propose a new sample retrieval method that uses each sample's historical usage frequency and the inter-class gradient similarity to preferentially select samples the model has not yet fully learned.
   - This method improves the model's learning efficiency without adding computational cost.

### Experimental verification

The paper validates the approach on the CIFAR-10/100, CLEAR-10/100, and ImageNet-1K datasets. The results show that, under the same total resource budget, the proposed method outperforms existing state-of-the-art methods.

Through these improvements, the paper aims to provide a fairer and more effective online continual learning framework for resource-limited practical applications.
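
The storage accounting and the frequency-based retrieval listed under the main contributions can be illustrated with a small Python sketch. It is not the authors' implementation. The function names, the 4-byte float32 assumption, and the `1 / (1 + use_count)` weighting are all hypothetical choices made for illustration. The inter-class gradient similarity term that the paper also uses is omitted here.

```python
import random

BYTES_PER_FLOAT32 = 4  # assumption: parameters and logits are stored as float32


def total_memory_bytes(num_replay_samples, bytes_per_sample, num_model_params,
                       num_extra_model_copies=0, num_stored_logits=0, logit_dim=0):
    """Hypothetical total storage budget in bytes.

    Counts the replay memory plus the extras that are often ignored:
    additional model copies and stored logits.
    """
    replay = num_replay_samples * bytes_per_sample
    models = num_model_params * BYTES_PER_FLOAT32 * (1 + num_extra_model_copies)
    logits = num_stored_logits * logit_dim * BYTES_PER_FLOAT32
    return replay + models + logits


def retrieval_probs(use_counts):
    """Assumed rule: a sample used less often gets a higher retrieval probability."""
    weights = [1.0 / (1 + count) for count in use_counts]
    total = sum(weights)
    return [w / total for w in weights]


def retrieve_batch(use_counts, batch_size, rng):
    """Draw memory indices (with replacement) and update their usage counts in place."""
    probs = retrieval_probs(use_counts)
    indices = rng.choices(range(len(use_counts)), weights=probs, k=batch_size)
    for i in indices:
        use_counts[i] += 1
    return indices
```

As a usage example, `total_memory_bytes(500, 3072, 1000, 1, 500, 10)` returns `1564000`. The inputs are 500 stored 32×32×3 uint8 images, a 1000-parameter model with one extra copy, and 500 stored 10-dimensional logits. The replay memory alone accounts for 1,536,000 of those bytes. The rest is the kind of cost that a comparison based only on replay size would miss.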