Abstract: Advancements in adapting deep convolution architectures for spiking neural networks (SNNs) have significantly enhanced image classification performance and reduced computational burdens. However, these gains are limited by the incompatibility of multiplication-free inference (MFI) with attention and transformer mechanisms, which are critical to superior performance on high-resolution vision tasks. To address this, our research explores a new pathway, drawing inspiration from the progress made in multilayer perceptrons (MLPs). We propose an innovative spiking MLP architecture that uses batch normalization (BN) to retain MFI compatibility and introduce a spiking patch encoding (SPE) layer to enhance local feature extraction capabilities. As a result, we establish an efficient multistage spiking MLP network that effectively blends global receptive fields with local feature extraction for comprehensive spike-based computation. Without relying on pretraining or sophisticated SNN training techniques, our network secures a top-one accuracy of 66.39% on the ImageNet-1K dataset, surpassing the directly trained spiking ResNet-34 by 2.67%. Furthermore, we curtail computational costs, model parameters, and simulation steps. An expanded version of our network is comparable in performance to the spiking VGG-16 network, with a 71.64% top-one accuracy, while operating with a model capacity 2.1 times smaller. Our findings highlight the potential of our deep SNN architecture in effectively integrating global and local learning abilities. Interestingly, the trained receptive field in our network mirrors the activity patterns of cortical cells.
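
To illustrate why BN can be compatible with MFI, the following is a minimal sketch, not the authors' implementation. It assumes the standard technique of folding inference-time BN statistics into the preceding linear layer, and it assumes binary spike inputs (0 or 1), so that the layer output reduces to summing the folded weights of the active inputs. The function names `fold_bn`, `spike_linear`, and `reference_linear_bn` are hypothetical and chosen only for this example.

```python
import math


def fold_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-time BN into the preceding linear layer (done once, offline).

    weights: list of rows, one row per output unit.
    All other arguments: one value per output unit.
    """
    folded_w, folded_b = [], []
    for row, b, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)          # per-unit BN scale
        folded_w.append([w * scale for w in row])
        folded_b.append((b - m) * scale + bt)   # per-unit BN shift
    return folded_w, folded_b


def spike_linear(folded_w, folded_b, spikes):
    """Linear layer on binary spikes using additions only (assumed MFI form)."""
    out = []
    for row, b in zip(folded_w, folded_b):
        acc = b
        for w, s in zip(row, spikes):
            if s:            # a spike selects the weight; no multiplication
                acc += w
        out.append(acc)
    return out


def reference_linear_bn(weights, bias, gamma, beta, mean, var, spikes, eps=1e-5):
    """Unfolded linear layer followed by BN, used only to check the folding."""
    out = []
    for row, b, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        z = sum(w * s for w, s in zip(row, spikes)) + b
        out.append(g * (z - m) / math.sqrt(v + eps) + bt)
    return out


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
    b = [0.1, -0.2]
    gamma, beta = [1.2, 0.8], [0.05, -0.1]
    mean, var = [0.3, -0.4], [0.9, 1.6]
    fw, fb = fold_bn(W, b, gamma, beta, mean, var)
    print(spike_linear(fw, fb, [1, 0, 1]))
```

Under these assumptions the multiplications occur only once, when the weights are folded before deployment; at inference each spike triggers an accumulation, which is the property the abstract refers to as MFI compatibility.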

Efficient Deep Spiking Multilayer Perceptrons With Multiplication-Free Inference