Spiking Transfer Learning From RGB Image to Neuromorphic Event Stream

Qiugang Zhan,Guisong Liu,Xiurui Xie,Ran Tao,Malu Zhang,Huajin Tang
DOI: https://doi.org/10.1109/TIP.2024.3430043
Abstract:Recent advances in bio-inspired vision with event cameras and associated spiking neural networks (SNNs) have provided promising solutions for low-power consumption neuromorphic tasks. However, as the research of event cameras is still in its infancy, the amount of labeled event stream data is much less than that of the RGB database. The traditional method of converting static images into event streams by simulation to increase the sample size cannot simulate the characteristics of event cameras such as high temporal resolution. To take advantage of both the rich knowledge in labeled RGB images and the features of the event camera, we propose a transfer learning method from the RGB to the event domain in this paper. Specifically, we first introduce a transfer learning framework named R2ETL (RGB to Event Transfer Learning), including a novel encoding alignment module and a feature alignment module. Then, we introduce the temporal centered kernel alignment (TCKA) loss function to improve the efficiency of transfer learning. It aligns the distribution of temporal neuron states by adding a temporal learning constraint. Finally, we theoretically analyze the amount of data required by the deep neuromorphic model to prove the necessity of our method. Numerous experiments demonstrate that our proposed framework outperforms the state-of-the-art SNN and artificial neural network (ANN) models trained on event streams, including N-MNIST, CIFAR10-DVS and N-Caltech101. This indicates that the R2ETL framework is able to leverage the knowledge of labeled RGB images to help the training of SNN on event streams.
What problem does this paper attempt to address?