Toward High-Accuracy and Low-Latency Spiking Neural Networks With Two-Stage Optimization

Ziming Wang,Yuhao Zhang,Shuang Lian,Xiaoxin Cui,Rui Yan,Huajin Tang
DOI: https://doi.org/10.1109/TNNLS.2023.3337176
2023-12-15
Abstract:Spiking neural networks (SNNs) operating with asynchronous discrete events show higher energy efficiency with sparse computation. A popular approach for implementing deep SNNs is artificial neural network (ANN)-SNN conversion combining both efficient training of ANNs and efficient inference of SNNs. However, the accuracy loss is usually nonnegligible, especially under few time steps, which restricts the applications of SNN on latency-sensitive edge devices greatly. In this article, we first identify that such performance degradation stems from the misrepresentation of the negative or overflow residual membrane potential in SNNs. Inspired by this, we decompose the conversion error into three parts: quantization error, clipping error, and residual membrane potential representation error. With such insights, we propose a two-stage conversion algorithm to minimize those errors, respectively. In addition, we show that each stage achieves significant performance gains in a complementary manner. By evaluating on challenging datasets including CIFAR-10, CIFAR-100, and ImageNet, the proposed method demonstrates the state-of-the-art performance in terms of accuracy, latency, and energy preservation. Furthermore, our method is evaluated using a more challenging object detection task, revealing notable gains in regression performance under ultralow latency, when compared with existing spike-based detection algorithms. Codes will be available at: https://github.com/Windere/snn-cvt-dual-phase.
What problem does this paper attempt to address?