SIESTA: Efficient Online Continual Learning with Sleep

Md Yousuf Harun, Jhair Gallardo, Tyler L. Hayes, Ronald Kemker, Christopher Kanan
2023-11-02
Abstract: In supervised continual learning, a deep neural network (DNN) is updated with an ever-growing data stream. Unlike the offline setting where data is shuffled, we cannot make any distributional assumptions about the data stream. Ideally, only one pass through the dataset is needed for computational efficiency. However, existing methods are inadequate and make many assumptions that cannot be made for real-world applications, while simultaneously failing to improve computational efficiency. In this paper, we propose a novel continual learning method, SIESTA based on wake/sleep framework for training, which is well aligned to the needs of on-device learning. The major goal of SIESTA is to advance compute efficient continual learning so that DNNs can be updated efficiently using far less time and energy. The principal innovations of SIESTA are: 1) rapid online updates using a rehearsal-free, backpropagation-free, and data-driven network update rule during its wake phase, and 2) expedited memory consolidation using a compute-restricted rehearsal policy during its sleep phase. For memory efficiency, SIESTA adapts latent rehearsal using memory indexing from REMIND. Compared to REMIND and prior arts, SIESTA is far more computationally efficient, enabling continual learning on ImageNet-1K in under 2 hours on a single GPU; moreover, in the augmentation-free setting it matches the performance of the offline learner, a milestone critical to driving adoption of continual learning in real-world applications.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper asks how a deep neural network (DNN) can learn efficiently from an ever-growing data stream in supervised continual learning while keeping compute and memory costs low. Specifically, it targets online learning that needs only one pass through the dataset and makes no assumptions about the data distribution. Existing methods either rely on assumptions that do not hold in practical applications or fail to improve computational efficiency significantly. The paper therefore proposes a new continual learning method, SIESTA, which uses a "wake/sleep" framework so that DNNs can be updated with less time and energy.

### Main contributions:

1. **Framework and algorithm**: The paper proposes a framework that combines online updates with offline memory consolidation and describes in detail the SIESTA algorithm that operates under it. SIESTA performs rapid online learning and inference in the "wake" phase and offline memory consolidation in the "sleep" phase.
2. **Performance improvement**: For class-incremental learning on ImageNet-1K, SIESTA achieves state-of-the-art performance with fewer parameters, less memory, and less compute. Without data augmentation, training SIESTA takes only 1.9 hours on a single NVIDIA A5000 GPU. Other recent methods require substantially more compute.
3. **Performance comparable to offline models**: SIESTA is the first continual learning algorithm to match the performance of an offline learner when no data augmentation is used. In that setting it does not suffer from catastrophic forgetting, and it can handle data in any order, achieving similar performance in the class-incremental setting and the independent and identically distributed (iid) setting.

### Key points of the solution:

- **Online learning and lightweight update**: In the "wake" phase, SIESTA updates only the output layer of the DNN, using a data-driven update rule that needs neither rehearsal nor backpropagation. This helps avoid catastrophic forgetting and keeps online updates lightweight.
- **Offline memory consolidation**: In the "sleep" phase, SIESTA consolidates memory with a compute-restricted rehearsal policy, which further improves computational efficiency.
- **Memory efficiency**: SIESTA adopts the quantized latent rehearsal scheme with memory indexing from REMIND, which lets it store more samples within a limited memory budget.

### Conclusion:

With SIESTA, the paper addresses the limitations of existing continual learning methods in computational efficiency and practical applicability, and it makes efficient on-device learning and inference more feasible. SIESTA's results show that combining online learning with offline memory consolidation can improve continual learning performance, and they point to new directions for future research and practical applications.
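
The wake-phase update described under the key points above can be illustrated with a short sketch. The summary does not spell out the update rule, so this assumes one simple backpropagation-free choice: each class's output-layer weight vector is the running mean of the feature vectors seen for that class, and prediction picks the class with the highest cosine similarity. The class name `RunningMeanOutputLayer`, its methods, and the toy features are hypothetical, and plain Python lists stand in for the embeddings a frozen DNN backbone would produce.

```python
import math


def _cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


class RunningMeanOutputLayer:
    """Hypothetical output layer: one weight vector per class, kept as a running mean."""

    def __init__(self, feature_dim):
        self.feature_dim = feature_dim
        self.weights = {}  # class label -> mean feature vector
        self.counts = {}   # class label -> number of samples folded in so far

    def wake_update(self, feature, label):
        """Fold one labeled feature into its class mean; no gradients, no stored samples."""
        if len(feature) != self.feature_dim:
            raise ValueError("feature has the wrong dimensionality")
        count = self.counts.get(label, 0)
        mean = self.weights.get(label, [0.0] * self.feature_dim)
        # Incremental mean: new_mean = (count * old_mean + x) / (count + 1)
        self.weights[label] = [
            (count * m + x) / (count + 1) for m, x in zip(mean, feature)
        ]
        self.counts[label] = count + 1

    def predict(self, feature):
        """Return the label whose class mean is most similar to the feature."""
        if not self.weights:
            raise ValueError("no classes have been learned yet")
        return max(self.weights, key=lambda k: _cosine(self.weights[k], feature))


# Toy stream: samples arrive one at a time, in arbitrary class order.
layer = RunningMeanOutputLayer(feature_dim=2)
for feature, label in [([1.0, 0.0], "cat"), ([3.0, 0.0], "cat"), ([0.0, 2.0], "dog")]:
    layer.wake_update(feature, label)
print(layer.predict([0.9, 0.1]))
```

Each update touches a single class vector and costs O(d) for a d-dimensional feature, which is why a rule of this kind needs neither gradients nor stored past samples during the wake phase.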