Abstract: Brain-inspired Spiking Neural Networks (SNNs) are event-driven and highly energy-efficient, which distinguishes them from traditional Artificial Neural Networks (ANNs) when deployed on edge devices such as neuromorphic chips. Most previous work focuses on SNN training strategies to improve model performance, which has led to larger and deeper network architectures. These complex networks are difficult to deploy directly on resource-limited edge devices. To meet this demand, SNNs are compressed very cautiously to balance performance and computation efficiency. Existing compression methods either iteratively prune SNNs using weight norm magnitude or formulate the problem as a sparse learning optimization. We propose an improved end-to-end Minimax optimization method for this sparse learning problem to better balance model performance and computation efficiency. We also demonstrate that jointly applying compression and finetuning to SNNs is better than applying them sequentially, especially for extreme compression ratios. The compressed SNN models achieve state-of-the-art (SOTA) performance on various benchmark datasets and architectures. Our code is available at <a class="link-external link-https" href="https://github.com/chenjallen/Resource-Constrained-Compression-on-SNN" rel="external noopener nofollow">this https URL</a>.

What problem does this paper attempt to address?

The paper addresses how to compress spiking neural networks (SNNs) efficiently under resource constraints while balancing model performance and computational efficiency. SNNs are increasingly used on edge devices, whose computing resources are usually limited, so complex SNN models are difficult to deploy directly. A method is therefore needed to compress SNN models so that they can run in resource-constrained environments while maintaining high performance.

The paper proposes an end-to-end model compression method based on Minimax optimization. The method controls resource consumption by introducing learnable sparsity parameters, and it jointly optimizes the sparsity and weight parameters of the model within the Minimax framework, achieving model compression under resource constraints.

### Main contributions of the paper:

1. **Proposing an end-to-end Minimax optimization method**: The method compresses SNN models by combining a difference-of-convex (DC) sparsity reformulation with the straight-through estimator (STE) to build a gradient-based optimization algorithm that respects resource constraints.
2. **Transforming resource-constrained SNN compression into a constrained optimization problem**: Learnable sparsity parameters link resource consumption to the model weights, which makes the problem solvable with gradient methods.
3. **Easy to train and effective**: Experimental results on multiple public benchmark tasks show that the method compresses SNN models effectively and achieves state-of-the-art performance at different compression ratios.
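The minimax idea described above can be illustrated with a toy sketch. It is not the paper's algorithm: the scalar keep ratio `k`, the surrogate task loss `(1 - k)**2`, the budget of 0.3, and the function name `minimax_keep_ratio` are all assumptions chosen only so the behavior is easy to check. The sketch runs gradient descent on the primal variable and gradient ascent on a Lagrange multiplier `lam` that penalizes exceeding the resource budget.

```python
def minimax_keep_ratio(budget=0.3, lr=0.05, steps=2000):
    """Gradient descent-ascent on L(k, lam) = (1 - k)**2 + lam * (k - budget).

    k   : fraction of weights kept (hypothetical learnable sparsity parameter)
    lam : Lagrange multiplier for the resource constraint k <= budget
    """
    k, lam = 1.0, 0.0  # start dense, with the constraint penalty switched off
    for _ in range(steps):
        grad_k = -2.0 * (1.0 - k) + lam   # dL/dk
        grad_lam = k - budget             # dL/dlam, i.e. the constraint violation
        k = min(1.0, max(0.0, k - lr * grad_k))  # primal descent, k kept in [0, 1]
        lam = max(0.0, lam + lr * grad_lam)      # dual ascent, lam kept >= 0
    return k, lam


k, lam = minimax_keep_ratio()
print(round(k, 4), round(lam, 4))
```

For this toy problem the constrained optimum can be derived by hand: the constraint is active at `k = 0.3`, and stationarity gives `lam = 2 * (1 - 0.3) = 1.4`, which is the point the iteration approaches. In a real network the surrogate loss would be replaced by the training loss and the resource term by a measured cost such as parameter count or operations.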
### Problems solved:

- **Model compression under resource constraints**: How to compress SNN models efficiently so that they can run on edge devices with limited computing resources.
- **Balancing performance and computational efficiency**: How to maintain model performance while compressing, avoiding the degradation caused by excessive compression.

### Innovation points of the method:

- **Minimax optimization framework**: The framework jointly optimizes sparsity constraints and resource constraints, addressing the difficulty traditional methods have in balancing performance and computational efficiency.
- **DC sparsity reformulation and STE**: The difference-of-convex (DC) sparsity reformulation and the straight-through estimator (STE) handle the non-continuous sparsity constraints, making the optimization problem solvable with gradient methods.

### Experimental results:

- **Performance on multiple datasets**: The paper reports experiments on MNIST, CIFAR10, CIFAR100, and ImageNet, with state-of-the-art performance at different compression ratios.
- **Comparison with existing methods**: Compared with existing SNN compression methods such as ADMM-based, Deep R, and Grad R, the method is reported to achieve better sparsity and accuracy.

In conclusion, the paper proposes a Minimax optimization method that addresses SNN model compression in resource-constrained environments and supports the practical deployment of SNNs on edge devices.
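The role of the straight-through estimator mentioned above can be shown with a small, self-contained sketch. It makes several assumptions that are not taken from the paper: a linear least-squares model stands in for an SNN, a top-k magnitude mask stands in for the sparsity constraint, and the names `topk_mask` and `train_sparse_ste` are hypothetical. The forward pass uses pruned weights, while the backward pass treats the masking step as the identity so that pruned weights still receive gradient.

```python
import random


def topk_mask(w, k):
    """Return a 0/1 mask that keeps the k largest-magnitude entries of w."""
    keep = sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)[:k]
    mask = [0.0] * len(w)
    for i in keep:
        mask[i] = 1.0
    return mask


def train_sparse_ste(X, y, k, lr=0.05, steps=300, seed=0):
    """Fit a k-sparse linear model with a masked forward pass and STE backward."""
    rng = random.Random(seed)
    d = len(X[0])
    w = [rng.gauss(0.0, 0.1) for _ in range(d)]  # dense latent weights
    for _ in range(steps):
        mask = topk_mask(w, k)
        w_eff = [wi * mi for wi, mi in zip(w, mask)]  # forward uses pruned weights
        # Gradient of the half mean squared error w.r.t. the masked weights.
        grad = [0.0] * d
        for row, target in zip(X, y):
            err = sum(a * b for a, b in zip(row, w_eff)) - target
            for j in range(d):
                grad[j] += err * row[j] / len(X)
        # STE: treat d(w_eff)/d(w) as the identity, so pruned weights are
        # still updated and can re-enter the top-k set in a later step.
        w = [wi - lr * g for wi, g in zip(w, grad)]
    mask = topk_mask(w, k)
    return [wi * mi for wi, mi in zip(w, mask)]


# Toy data: 6 features, only 2 of which matter (values chosen for illustration).
data_rng = random.Random(1)
w_true = [0.0, 2.0, 0.0, 0.0, -1.5, 0.0]
X = [[data_rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(32)]
y = [sum(a * b for a, b in zip(row, w_true)) for row in X]
w_hat = train_sparse_ste(X, y, k=2)
print(w_hat)
```

The design point is that the mask is recomputed from the dense weights at every step, so the set of kept weights can change during training instead of being fixed by a one-off pruning pass.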

Resource Constrained Model Compression via Minimax Optimization for Spiking Neural Networks

MCMC: Multi-Constrained Model Compression Via One-Stage Envelope Reinforcement Learning.

Efficient Structure Slimming for Spiking Neural Networks

BitSNNs: Revisiting Energy-efficient Spiking Neural Networks

Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization

An efficient pruning and fine-tuning method for deep spiking neural network

Neural Network Compression Via Sparse Optimization

CompSNN: A lightweight spiking neural network based on spatiotemporally compressive spike features

LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization

Efficient Neural Network Compression Inspired by Compressive Sensing.

Spike Trains Encoding and Threshold Rescaling Method for Deep Spiking Neural Networks

Towards Efficient Deep Spiking Neural Networks Construction with Spiking Activity based Pruning

An Efficient Spiking Neural Network Accelerator with Sparse Weight.

tinySNN: Towards Memory- and Energy-Efficient Spiking Neural Networks

Toward High-Accuracy and Low-Latency Spiking Neural Networks With Two-Stage Optimization

Marmotini: A Weight Density Adaptation Architecture with Hybrid Compression Method for Spiking Neural Network

Q-SNNs: Quantized Spiking Neural Networks

Evolving Efficient Genetic Encoding for Deep Spiking Neural Networks

Toward Efficient Deep Spiking Neuron Networks: A Survey On Compression

A Convolutional Spiking Neural Network Accelerator with the Sparsity-Aware Memory and Compressed Weights

Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training