Brain-inspired continual pre-trained learner via silent synaptic consolidation

Xuming Ran, Juntao Yao, Yusong Wang, Mingkun Xu, Dianbo Liu
2024-10-08
Abstract: Pre-trained models have demonstrated impressive generalization capabilities, yet they remain vulnerable to catastrophic forgetting when incrementally trained on new tasks. Existing architecture-based strategies encounter two primary challenges: 1) Integrating a pre-trained network with a trainable sub-network complicates the delicate balance between learning plasticity and memory stability across evolving tasks during learning. 2) The absence of robust interconnections between pre-trained networks and various sub-networks limits the effective retrieval of pertinent information during inference. In this study, we introduce the Artsy, inspired by the activation mechanisms of silent synapses via spike-timing-dependent plasticity observed in mature brains, to enhance the continual learning capabilities of pre-trained models. The Artsy integrates two key components: During training, the Artsy mimics mature brain dynamics by maintaining memory stability for previously learned knowledge within the pre-trained network while simultaneously promoting learning plasticity in task-specific sub-networks. During inference, artificial silent and functional synapses are utilized to establish precise connections between the pre-synaptic neurons in the pre-trained network and the post-synaptic neurons in the sub-networks, facilitated through synaptic consolidation, thereby enabling effective extraction of relevant information from test samples. Comprehensive experimental evaluations reveal that our model significantly outperforms conventional methods on class-incremental learning tasks, while also providing enhanced biological interpretability for architecture-based approaches. Moreover, we propose that the Artsy offers a promising avenue for simulating biological synaptic mechanisms, potentially advancing our understanding of neural plasticity in both artificial and biological systems.
Machine Learning
What problem does this paper attempt to address?
This paper addresses how pre-trained models can avoid catastrophic forgetting while learning new tasks in continual learning (CL). Specifically, it asks how a model can keep its prior knowledge stable as new tasks arrive while remaining plastic enough to learn them. Existing architecture-based strategies, which pair a pre-trained network with trainable sub-networks, face two main problems:

1. **Balancing learning plasticity and memory stability**: As tasks change over time, the model must adapt to each new task without overwriting existing knowledge. Existing methods struggle to strike this balance.
2. **Lack of effective connections between networks**: The connections between the pre-trained network and the different sub-networks are not robust enough, which makes it hard to retrieve the relevant information at inference time.

To solve these problems, the paper proposes the **Artsy framework**, inspired by the activation of silent synapses in the mature brain through spike-timing-dependent plasticity (STDP). The Artsy framework contains two key components:

1. **Training phase**: Mimic the dynamics of the mature brain by keeping previously learned knowledge stable in the pre-trained network while promoting learning plasticity in the task-specific sub-networks.
2. **Inference phase**: Use artificial silent synapses and functional synapses to establish precise connections between pre-synaptic neurons in the pre-trained network and post-synaptic neurons in the sub-networks, so that relevant information can be extracted from test samples.
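
The training-phase idea above (a fixed pre-trained network plus a trainable sub-network per task) can be illustrated with a toy sketch. This is not the authors' implementation: the `ContinualModel` class, the `matvec` helper, the single linear layer per task, and the squared-error SGD step are all assumptions chosen for brevity, and the sketch does not model silent or functional synapses.

```python
from dataclasses import dataclass, field


def matvec(weights, x):
    """Multiply a row-major matrix (sequence of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


@dataclass
class ContinualModel:
    # Hypothetical stand-in for the frozen pre-trained network.
    # Stored as a tuple of tuples so it cannot be updated in place.
    pretrained: tuple
    # One trainable sub-network (here a single linear layer) per task id.
    subnets: dict = field(default_factory=dict)

    def features(self, x):
        """Features from the frozen network; no parameters change here."""
        return matvec(self.pretrained, x)

    def add_task(self, task_id, out_dim):
        """Create a fresh, zero-initialised sub-network for a new task."""
        feat_dim = len(self.pretrained)
        self.subnets[task_id] = [[0.0] * feat_dim for _ in range(out_dim)]

    def forward(self, task_id, x):
        return matvec(self.subnets[task_id], self.features(x))

    def train_step(self, task_id, x, target, lr=0.1):
        """One SGD step on 0.5 * squared error; only the sub-network moves."""
        h = self.features(x)
        y = matvec(self.subnets[task_id], h)
        for i, row in enumerate(self.subnets[task_id]):
            err = y[i] - target[i]
            for j in range(len(row)):
                row[j] -= lr * err * h[j]


# Usage: train two tasks in sequence on top of the same frozen weights.
model = ContinualModel(pretrained=((1.0, 0.0), (0.0, 1.0)))

model.add_task("task_1", out_dim=1)
for _ in range(50):
    model.train_step("task_1", x=[1.0, 2.0], target=[3.0])
task_1_before = [row[:] for row in model.subnets["task_1"]]

model.add_task("task_2", out_dim=1)
for _ in range(50):
    model.train_step("task_2", x=[2.0, 1.0], target=[-1.0])
```

In this sketch, training `task_2` touches neither the frozen weights nor the `task_1` sub-network, which is the stability half of the trade-off. How the right sub-network is selected at inference time is deliberately left out.
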
### Specific implementation

- **Pre-trained network**: Analogous to the stable functional synapses in the mature brain, the pre-trained network is kept fixed to preserve memory stability.
- **Sub-network initialization**: Analogous to the silent synapses located on filopodia in the mature brain, the newly initialized sub-networks provide the capacity to learn new tasks.
- **Artificial synapses**: Artificial silent synapses and functional synapses connect the pre-trained network to the sub-networks. An artificial silent synapse can be converted into a functional synapse when it receives a new stimulus, forming a new connection.
- **Classifier**: Non-parametric linear classifiers (such as prototype-based classifiers) are used to classify new data.

### Experimental results

The paper reports extensive experiments on Class-Incremental Learning (CIL) tasks, in which the Artsy framework significantly outperforms traditional methods. Artsy achieves strong Average Accuracy and Last Accuracy on the CIFAR-100 and TinyImageNet datasets. For example, on CIFAR-100 its average accuracy is 92.44% and its last accuracy is 87.94%.

### Conclusion

By drawing on the mechanism of silent synapses in biological neural networks, the Artsy framework gives pre-trained models continual learning ability and offers a more biologically interpretable architecture-based method. This improves model performance and also provides a new perspective for understanding plasticity in artificial and biological nervous systems.
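
For readers unfamiliar with the prototype-based classifiers mentioned under the classifier component, the following is a minimal sketch of a nearest-class-mean classifier. It is an illustration only, not the paper's code: the `PrototypeClassifier` class, the `cosine` helper, the choice of cosine similarity, and the two-dimensional toy features are all assumptions.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


class PrototypeClassifier:
    """Nearest-class-mean classifier over fixed feature vectors.

    There are no trained weights: each class is represented by the mean
    of the feature vectors seen for it, so new classes can be added
    without changing the prototypes of old ones.
    """

    def __init__(self):
        self._sums = {}
        self._counts = defaultdict(int)

    def update(self, features, label):
        """Fold one feature vector into the running sum of its class."""
        if label not in self._sums:
            self._sums[label] = [0.0] * len(features)
        self._sums[label] = [s + f for s, f in zip(self._sums[label], features)]
        self._counts[label] += 1

    def prototype(self, label):
        """Mean feature vector of a class."""
        n = self._counts[label]
        return [s / n for s in self._sums[label]]

    def predict(self, features):
        """Return the label whose prototype is most similar to the input."""
        return max(self._sums, key=lambda c: cosine(self.prototype(c), features))


# Usage: two initial classes, then a third class added incrementally.
clf = PrototypeClassifier()
clf.update([1.0, 0.0], "cat")
clf.update([0.8, 0.2], "cat")
clf.update([0.0, 1.0], "dog")
first = clf.predict([0.9, 0.1])    # "cat"

clf.update([-1.0, 0.0], "bird")    # new class arrives later
second = clf.predict([-0.7, 0.1])  # "bird"
third = clf.predict([0.9, 0.1])    # still "cat"
```

Because adding `"bird"` only creates a new prototype, predictions for earlier classes are unaffected as long as the feature extractor itself stays fixed.
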