DRIVE: Dual Gradient-Based Rapid Iterative Pruning

Dhananjay Saikumar, Blesson Varghese

2024-04-02

Abstract: Modern deep neural networks (DNNs) consist of millions of parameters, necessitating high-performance computing during training and inference. Pruning is one solution that significantly reduces the space and time complexities of DNNs. Traditional pruning methods that are applied post-training focus on streamlining inference, but there are recent efforts to leverage sparsity early on by pruning before training. Pruning methods such as iterative magnitude-based pruning (IMP) achieve up to a 90% parameter reduction while retaining accuracy comparable to the original model. However, IMP has an impractical runtime because it relies on multiple train-prune-reset cycles to identify and eliminate redundant parameters. In contrast, training-agnostic early pruning methods, such as SNIP and SynFlow, offer fast pruning but fall short of the accuracy achieved by IMP at high sparsities. To bridge this gap, we present Dual Gradient-Based Rapid Iterative Pruning (DRIVE), which leverages dense training for initial epochs to counteract the randomness inherent at initialization. Subsequently, it employs a unique dual gradient-based metric for parameter ranking. Experiments with VGG and ResNet architectures on CIFAR-10/100 and Tiny ImageNet, and with ResNet on ImageNet, demonstrate that DRIVE consistently achieves higher accuracy than other training-agnostic early pruning methods. Notably, DRIVE is 43$\times$ to 869$\times$ faster than IMP for pruning.
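
To see why repeated train-prune-reset cycles are costly, consider a rough worked example. It assumes a per-cycle pruning rate $p$ of 20% of the remaining weights, a value chosen here purely for illustration and not taken from the abstract. Reaching a target sparsity $s$ then needs the smallest $n$ with $(1-p)^n \le 1-s$:

$$
n = \left\lceil \frac{\ln(1-s)}{\ln(1-p)} \right\rceil
  = \left\lceil \frac{\ln 0.1}{\ln 0.8} \right\rceil
  = \lceil 10.32 \rceil = 11 \quad \text{for } s = 0.9,\ p = 0.2 .
$$

Under that assumption, 90% sparsity would take 11 full training runs, since $0.8^{10} \approx 0.107$ still leaves more than 10% of the weights and $0.8^{11} \approx 0.086$ does not.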

Machine Learning, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper aims to prune deep neural networks (DNNs) efficiently without sacrificing accuracy. Modern DNNs contain millions of parameters and require high-performance computing resources during training and inference. Traditional pruning methods are applied after training and aim to reduce the space and time complexity of inference. Iterative magnitude-based pruning (IMP) can remove up to 90% of the parameters while retaining accuracy comparable to the original model, but it relies on multiple train-prune-reset cycles to identify and eliminate redundant parameters, which makes its runtime impractical. Training-agnostic early pruning methods such as SNIP and SynFlow prune quickly, but they fall short of the accuracy of IMP at high sparsity levels. To bridge this gap, the paper proposes Dual Gradient-Based Rapid Iterative Pruning (DRIVE). DRIVE counteracts the randomness inherent at initialization by training the dense network for the first few epochs, and then ranks the parameters using a dual gradient-based metric. The method is evaluated with VGG and ResNet architectures on CIFAR-10/100 and Tiny ImageNet, and with ResNet on ImageNet. The results show that DRIVE is consistently more accurate than other training-agnostic early pruning methods and is 43 to 869 times faster than IMP for pruning. The main contributions of DRIVE are:

1. Development of DRIVE: a new pruning method that combines the advantages of initialization-based pruning and training-dependent pruning.
2. Efficient early pruning: unlike training-dependent methods such as IMP, DRIVE needs only a small number of training epochs before pruning.
3. Novel dual-gradient metric: the metric takes into account parameter magnitude, connection sensitivity, and convergence sensitivity when ranking parameters for pruning.

Together, these contributions improve the accuracy of the pruned network while keeping pruning fast.
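
The ranking-and-pruning idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the paper's metric: the summary does not give the formula, so `hypothetical_score` simply adds three stand-in terms (weight magnitude, a SNIP-style `|w * g|` connection term, and `|g|` as an assumed proxy for convergence sensitivity). The names `hypothetical_score`, `prune_step`, `iterative_prune`, and the weights `alpha` and `beta` are all invented for this example, and the gradients are held fixed across rounds, whereas a real pipeline would recompute them.

```python
def hypothetical_score(weight, grad, alpha=1.0, beta=1.0):
    """Toy saliency score; an assumed stand-in, not the DRIVE formula."""
    magnitude = abs(weight)            # parameter magnitude
    connection = abs(weight * grad)    # SNIP-style connection sensitivity
    convergence = abs(grad)            # assumed proxy for convergence sensitivity
    return magnitude + alpha * connection + beta * convergence


def prune_step(weights, grads, mask, keep_fraction):
    """Keep the top `keep_fraction` of the currently unpruned weights by score."""
    alive = [i for i, kept in enumerate(mask) if kept]
    n_keep = max(1, round(len(alive) * keep_fraction))
    ranked = sorted(
        alive,
        key=lambda i: hypothetical_score(weights[i], grads[i]),
        reverse=True,
    )
    survivors = set(ranked[:n_keep])
    return [i in survivors for i in range(len(mask))]


def iterative_prune(weights, grads, target_sparsity, rounds):
    """Prune over several rounds, keeping the same fraction of survivors each time."""
    # Geometric schedule: keep_per_round ** rounds == 1 - target_sparsity.
    keep_per_round = (1.0 - target_sparsity) ** (1.0 / rounds)
    mask = [True] * len(weights)
    for _ in range(rounds):
        # Gradients are reused here for brevity (an assumption of this sketch).
        mask = prune_step(weights, grads, mask, keep_per_round)
    return mask


if __name__ == "__main__":
    # Weights as they might look after a few epochs of dense training (made-up values).
    w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2, 0.03, -0.6, 0.1, 0.5]
    g = [0.1] * len(w)
    print(iterative_prune(w, g, target_sparsity=0.8, rounds=2))
```

Because the per-round keep count is rounded, the final sparsity may differ slightly from the target on small inputs; pruning in several small rounds rather than one shot is what makes the scheme iterative.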

Structured Probabilistic Pruning for Convolutional Neural Network Acceleration.

A Feature-map Discriminant Perspective for Pruning Deep Neural Networks

Class-Aware Pruning for Efficient Neural Networks

Pruning by Training: A Novel Deep Neural Network Compression Framework for Image Processing.

PDP: Parameter-free Differentiable Pruning is All You Need

DMPP: Differentiable Multi-Pruner and Predictor for Neural Network Pruning

L2PF -- Learning to Prune Faster

Global Sparse Momentum SGD for Pruning Very Deep Neural Networks

Accelerating CNN Training by Pruning Activation Gradients

Pruning Filters while Training for Efficiently Optimizing Deep Learning Networks

Global balanced iterative pruning for efficient convolutional neural networks

Knapsack Pruning with Inner Distillation

CGaP: Continuous Growth and Pruning for Efficient Deep Learning

A Dynamic Pruning Method on Multiple Sparse Structures in Deep Neural Networks

FGGP: Fixed-Rate Gradient-First Gradual Pruning

Neural Network Pruning by Gradient Descent

Multi-Dimensional Dynamic Pruning: Exploring Spatial and Channel Fuzzy Sparsity

One-Cycle Pruning: Pruning ConvNets Under a Tight Training Budget

DPACS: Hardware Accelerated Dynamic Neural Network Pruning Through Algorithm-Architecture Co-design.

GenExp: Multi-objective pruning for deep neural network based on genetic algorithm