Abstract: The ability to sequentially learn multiple tasks without forgetting is a key skill of biological brains, yet it remains a major challenge for deep learning. To avoid catastrophic forgetting, various continual learning (CL) approaches have been devised. However, these usually require discrete task boundaries. This requirement seems biologically implausible and often limits the application of CL methods in the real world, where tasks are not always well defined. Here, we take inspiration from neuroscience, where sparse, non-overlapping neuronal representations have been suggested to prevent catastrophic forgetting. We argue that, as in the brain, these sparse representations should be chosen on the basis of feedforward (stimulus-specific) as well as top-down (context-specific) information. To implement such selective sparsity, we use a bio-plausible form of hierarchical credit assignment known as Deep Feedback Control (DFC) and combine it with a winner-take-all sparsity mechanism. In addition to sparsity, we introduce lateral recurrent connections within each layer to further protect previously learned representations. We evaluate the new sparse-recurrent version of DFC on the split-MNIST computer vision benchmark and show that only the combination of sparsity and intra-layer recurrent connections improves CL performance with respect to standard backpropagation. Our method achieves performance similar to that of well-known CL methods, such as Elastic Weight Consolidation and Synaptic Intelligence, without requiring information about task boundaries. Overall, we showcase the idea of adopting computational principles from the brain to derive new, task-free learning algorithms for CL.
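To make the winner-take-all sparsity mechanism mentioned in the abstract concrete, the following is a minimal sketch of a generic k-winner-take-all step in NumPy. It is not the authors' implementation: the function name `k_winner_take_all`, the parameter `k`, and the example values are hypothetical. The sketch picks winners purely by activation magnitude and does not model the DFC feedback controller or the lateral recurrent connections.

```python
import numpy as np


def k_winner_take_all(activations, k):
    """Keep the k largest activations in each row and set the rest to zero.

    Hypothetical helper for illustration; `activations` has shape
    (batch, units) and `k` is the number of units allowed to stay active.
    """
    activations = np.asarray(activations, dtype=float)
    if not 1 <= k <= activations.shape[-1]:
        raise ValueError("k must be between 1 and the layer width")
    # Indices of the k largest entries per row (unordered within the top k).
    top_idx = np.argpartition(activations, -k, axis=-1)[..., -k:]
    # Boolean mask that is True only at the winning units.
    mask = np.zeros_like(activations, dtype=bool)
    np.put_along_axis(mask, top_idx, True, axis=-1)
    return np.where(mask, activations, 0.0)


# Example: two input patterns, four units, two winners per pattern.
layer_output = np.array([[0.1, 0.9, 0.3, 0.7],
                         [0.5, 0.2, 0.8, 0.4]])
sparse_output = k_winner_take_all(layer_output, k=2)
```

In this toy example the two patterns activate disjoint sets of units, which is the kind of non-overlapping representation the abstract associates with reduced interference between tasks.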

Efficient Weight-Space Laplace-Gaussian Filtering and Smoothing for Sequential Deep Learning

Progressive Learning without Forgetting

Function-space Parameterization of Neural Networks for Sequential Learning

Continual Learning via Sequential Function-Space Variational Inference

Modelling continual learning in humans with Hebbian context gating and exponentially decaying task signals

On Sequential Bayesian Inference for Continual Learning

Efficient Bayesian Updates for Deep Learning via Laplace Approximations

Bio-inspired, task-free continual learning through activity regularization

Layerwise Optimization by Gradient Decomposition for Continual Learning

Deep Filtering With Adaptive Learning Rates

Efficient Learning Algorithms for Gaussian Processes

Disentangling and Mitigating the Impact of Task Similarity for Continual Learning

Accelerated Linearized Laplace Approximation for Bayesian Deep Learning

Task Agnostic Continual Learning via Meta Learning

Sparsity and Heterogeneous Dropout for Continual Learning in the Null Space of Neural Activations

Orthogonal Gradient Descent for Continual Learning

CODE-CL: COnceptor-Based Gradient Projection for DEep Continual Learning

Adaptive Progressive Continual Learning

Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding

Modeling Latent Neural Dynamics with Gaussian Process Switching Linear Dynamical Systems

Continuous Learning in a Single-Incremental-Task Scenario with Spike Features