Abstract: Recently, several structured pruning techniques have been introduced for energy-efficient implementation of Deep Neural Networks (DNNs) with fewer crossbars. Although these techniques claim to preserve the accuracy of sparse DNNs on crossbars, none have studied the impact of the inexorable crossbar non-idealities on the actual performance of the pruned networks. To this end, we perform a comprehensive study to show how highly sparse DNNs, which yield significant crossbar compression rates, can suffer severe accuracy losses compared to unpruned DNNs mapped onto non-ideal crossbars. We perform experiments with multiple structured-pruning approaches (such as C/F pruning, XCS and XRS) on VGG11 and VGG16 DNNs with benchmark datasets (CIFAR10 and CIFAR100). We propose two mitigation approaches, crossbar column rearrangement and Weight-Constrained-Training (WCT), that can be integrated with the crossbar mapping of the sparse DNNs to minimize accuracy losses incurred by the pruned models. These help in mitigating non-idealities by increasing the proportion of low-conductance synapses on crossbars, thereby improving their computational accuracy.

What problem does this paper attempt to address?

The problem this paper attempts to address is to **examine and mitigate the impact of crossbar non-idealities on the hardware implementation of sparse deep neural networks (DNNs)**. Specifically, the paper focuses on the following issues:

1. **Performance degradation of highly sparse DNNs on non-ideal crossbars**: Structured pruning techniques yield DNNs that map more efficiently onto hardware, but they do not account for the non-idealities of real crossbars (such as interconnect parasitics and synaptic nonlinearity and variation). As a result, the inference accuracy of sparse DNNs drops significantly when they are implemented on non-ideal crossbars.
2. **Trade-off between resource efficiency and performance**: As the structured sparsity of a DNN increases, hardware resource efficiency (area and energy) improves, but inference accuracy is sacrificed.

To address these problems, the paper carries out the following work:

- **Experimental verification**: Experiments on VGG11 and VGG16 DNNs with multiple structured pruning methods (such as C/F pruning, XCS and XRS), using the CIFAR10 and CIFAR100 datasets, demonstrate the performance degradation of highly sparse DNNs on non-ideal crossbars.
- **Proposing mitigation strategies**: Two hardware-friendly non-ideality mitigation strategies are proposed:
  - **Crossbar-column rearrangement**: Rearranging the columns of the weight matrix increases the proportion of low-conductance synapses, thereby reducing the impact of non-idealities.
  - **Weight-Constrained-Training (WCT)**: The structured-pruned DNN is trained in software with a restricted weight range so that more weights map to a low-conductance state, thereby reducing the impact of non-idealities.

Through these methods, the paper aims to improve the performance of sparse DNNs on non-ideal crossbars and to provide a reference for future research.
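To make the idea behind WCT concrete, the following is a minimal sketch of how restricting the weight range can raise the share of low-conductance synapses. It is not the paper's implementation. The conductance range (`G_MIN`, `G_MAX`), the linear magnitude-to-conductance mapping, the cutoff `w_cut`, and all function names are assumptions made for illustration, and the clipping is applied once to random weights rather than after each training update.

```python
import random

# Hypothetical device conductance range in siemens (assumption, not from the paper).
G_MIN, G_MAX = 1e-6, 1e-5

def to_conductance(w, w_max):
    """Linearly map a weight magnitude in [0, w_max] to [G_MIN, G_MAX] (assumed mapping)."""
    ratio = min(abs(w) / w_max, 1.0)
    return G_MIN + ratio * (G_MAX - G_MIN)

def constrain(w, w_cut):
    """Clip a weight to [-w_cut, w_cut], the constraint a WCT-style scheme would enforce."""
    return max(-w_cut, min(w, w_cut))

def low_conductance_fraction(weights, w_max, level=0.5):
    """Fraction of synapses whose conductance lies in the lower `level` share of the range."""
    threshold = G_MIN + level * (G_MAX - G_MIN)
    flat = [w for row in weights for w in row]
    low = sum(1 for w in flat if to_conductance(w, w_max) <= threshold)
    return low / len(flat)

random.seed(0)
# Stand-in for one 64x64 crossbar's worth of trained weights.
weights = [[random.gauss(0.0, 0.5) for _ in range(64)] for _ in range(64)]

# Keep the reference scale fixed so that clipped weights map to lower conductances.
w_max = max(abs(w) for row in weights for w in row)
w_cut = 0.5 * w_max  # hypothetical cutoff
constrained = [[constrain(w, w_cut) for w in row] for row in weights]

frac_before = low_conductance_fraction(weights, w_max)
frac_after = low_conductance_fraction(constrained, w_max)
print(f"low-conductance fraction: {frac_before:.3f} -> {frac_after:.3f}")
```

The sketch keeps `w_max` fixed when mapping the constrained weights; if the clipped weights were rescaled to span the full conductance range again, the proportion of low-conductance synapses would not change.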

Examining and Mitigating the Impact of Crossbar Non-idealities for Accurate Implementation of Sparse Deep Neural Networks

Learning to Slim Deep Networks with Bandit Channel Pruning

Crossbar-aware neural network pruning

Structured Pruning for Efficient Convolutional Neural Networks Via Incremental Regularization

Class-Aware Pruning for Efficient Neural Networks

Structured Deep Neural Network Pruning by Varying Regularization Parameters

Pruning for Improved ADC Efficiency in Crossbar-based Analog In-memory Accelerators

A Pruning Method Based on the Dissimilarity of Angle among Channels and Filters

Adversarial Structured Neural Network Pruning

Technology Aware Training in Memristive Neuromorphic Systems based on non-ideal Synaptic Crossbars

CRISP: Hybrid Structured Sparsity for Class-aware Model Pruning

PruneAug: Bridging DNN Pruning and Inference Latency on Diverse Sparse Platforms Using Automatic Layerwise Block Pruning

Cross-layer importance evaluation for neural network pruning

Examining the Robustness of Spiking Neural Networks on Non-ideal Memristive Crossbars

Intermediate-grained kernel elements pruning with structured sparsity

Exploring Compute-in-Memory Architecture Granularity for Structured Pruning of Neural Networks

Tiny but Accurate: A Pruned, Quantized and Optimized Memristor Crossbar Framework for Ultra Efficient DNN Implementation

Accurate Neural Network Pruning Requires Rethinking Sparse Optimization

Examining the Role and Limits of Batchnorm Optimization to Mitigate Diverse Hardware-noise in In-memory Computing

Investigating the Effect of Network Pruning on Performance and Interpretability

BinSparX: Sparsified Binary Neural Networks for Reduced Hardware Non-Idealities in Xbar Arrays