Abstract: It is well known that the high memory and compute costs of both artificial neural networks (ANNs) and spiking neural networks (SNNs) greatly hinder their efficient deployment on edge devices. Model compression has been proposed as a promising technique to improve running efficiency by reducing parameters and operations, but it has mainly been practiced on ANNs rather than SNNs. It is therefore interesting to ask how much an SNN model can be compressed without compromising its functionality. Two challenges must be addressed: 1) the accuracy of SNNs is usually sensitive to model compression, which calls for an accurate compression methodology, and 2) the computation of SNNs is event-driven rather than static, which adds an extra compression dimension on dynamic spikes. To this end, we realize a comprehensive SNN compression through three steps. First, we formulate connection pruning and weight quantization as a constrained optimization problem. Second, we combine spatiotemporal backpropagation (STBP) and the alternating direction method of multipliers (ADMM) to solve the problem with minimum accuracy loss. Third, we further propose activity regularization to reduce the number of spike events and thus the number of active operations. These methods can be applied individually for moderate compression or jointly for aggressive compression. We define several quantitative metrics to evaluate the compression performance for SNNs. Our methodology is validated on pattern recognition tasks over the MNIST, N-MNIST, CIFAR10, and CIFAR100 datasets, where extensive comparisons, analyses, and insights are provided. To the best of our knowledge, this is the first work that studies SNN compression in a comprehensive manner by exploiting all compressible components and achieves better results.
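As an illustration of the first two steps, the following is a minimal sketch of ADMM with projection onto a sparsity constraint, together with a projection onto a uniform quantization grid. It is not the authors' implementation: the toy quadratic loss stands in for the network loss that the abstract says is trained with STBP, and the function names, the penalty parameter `rho`, and the quantization grid (`step`, `levels`) are all assumptions introduced here for illustration.

```python
def project_sparse(values, keep):
    """Euclidean projection onto vectors with at most `keep` nonzero entries:
    keep the `keep` largest-magnitude entries and zero the rest."""
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    kept = set(order[:keep])
    return [v if i in kept else 0.0 for i, v in enumerate(values)]


def project_quantized(values, step, levels):
    """Snap each value to the nearest multiple of `step`,
    clipped to the range [-levels * step, levels * step]."""
    out = []
    for v in values:
        q = max(-levels, min(levels, round(v / step)))
        out.append(q * step)
    return out


def admm_prune(w0, keep, rho=1.0, iters=50):
    """ADMM for: minimize 0.5 * ||w - w0||^2 subject to w having <= `keep` nonzeros.

    The quadratic term is a toy stand-in (assumption) for the training loss.
    Returns (w, z): the unconstrained variable and its sparse counterpart.
    """
    w = list(w0)
    z = project_sparse(w, keep)   # auxiliary variable that satisfies the constraint
    u = [0.0] * len(w)            # scaled dual variable
    for _ in range(iters):
        # w-update: closed-form minimizer of the loss plus the ADMM penalty term
        w = [(a + rho * (zi - ui)) / (1.0 + rho) for a, zi, ui in zip(w0, z, u)]
        # z-update: project onto the constraint set
        z = project_sparse([wi + ui for wi, ui in zip(w, u)], keep)
        # dual update: accumulate the remaining mismatch between w and z
        u = [ui + wi - zi for ui, wi, zi in zip(u, w, z)]
    return w, z


w, z = admm_prune([0.9, -0.05, 0.4, 0.02, -0.7], keep=3)
q = project_quantized([0.26, -0.9, 0.04], step=0.25, levels=2)
```

In this sketch the two small-magnitude weights are driven toward zero while the three largest are retained. Swapping `project_sparse` for `project_quantized` in the z-update would give the corresponding quantization variant under the same assumptions.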

AutoML for DenseNet Compression

Deep Neural Network Acceleration with Sparse Prediction Layers

MCMC: Multi-Constrained Model Compression Via One-Stage Envelope Reinforcement Learning

Efficient Network Compression Through Smooth-Lasso Constraint

A Pruning Method Based on the Dissimilarity of Angle among Channels and Filters

Learning Efficient Convolutional Networks Through Network Slimming

SparseNet: A Sparse DenseNet for Image Classification

Connection Reduction of DenseNet for Image Recognition

On Compressing Deep Models by Low Rank and Sparse Decomposition

LDCNet: A Lightweight Multi-Scale Convolutional Neural Network Using Local Dense Connectivity for Image Recognition

Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization

Efficient Neural Network Compression Inspired by Compressive Sensing

AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates

DRF-DRC: dynamic receptive field and dense residual connections for model compression

Layerwise Sparse Coding for Pruned Deep Neural Networks with Extreme Compression Ratio

Neural Network Compression by Joint Sparsity Promotion and Redundancy Reduction

An Efficient Pruning and Fine-Tuning Method for Deep Spiking Neural Network

Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM

Neural Epitome Search for Architecture-Agnostic Network Compression

Neural Network Compression Via Sparse Optimization

ResNet Can Be Pruned 60x: Introducing Network Purification and Unused Path Removal (P-RM) after Weight Pruning