Abstract: Deploying Deep Neural Network (DNN)-based models on resource-constrained devices remains a significant challenge because of their high computational and parameter requirements. Layer pruning has emerged as a potent approach to reducing network size and improving computational efficiency. However, existing layer pruning methods mostly overlook the intrinsic connections and inter-dependencies between layers within complex deep neural networks. As a result, the pruned models may fail to preserve the essential characteristics of the pre-trained network. To address these limitations, we propose Similarity Guided fast Layer Partition pruning for compressing large deep models (SGLP), which prunes layers from network segments partitioned via representation similarity. Specifically, our method first uses Centered Kernel Alignment (CKA) to measure the similarity of internal representations across the layers of the pre-trained network, which provides a sound basis for layer pruning. Based on the similarity matrix derived from CKA, we employ Fisher Optimal Segmentation to partition the network into multiple segments, so that layers can be removed in a segment-wise manner. In addition, our method adopts GradNorm for segment-wise layer importance evaluation, eliminating the need for extensive fine-tuning, and finally prunes the unimportant layers to obtain a compact network. Experimental results on image classification and on large language models (LLMs) demonstrate that SGLP outperforms state-of-the-art methods in both accuracy and computational efficiency, offering a more effective solution for deploying DNNs on resource-limited platforms. Our code is available at https://github.com/itsnotacie/information-fusion-SGLP.
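
The first two stages described in the abstract (a CKA similarity matrix followed by a contiguous partition of the layers) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes linear CKA, and it assumes that the segmentation cost is the within-segment sum of squared deviations of the similarity-matrix rows, which is one common form of the Fisher optimal partition. The function names, the synthetic activations, and the choice of two segments are all hypothetical. The GradNorm importance scoring step is omitted.

```python
import numpy as np

def linear_cka(x, y):
    """Linear CKA between two activation matrices of shape (n_samples, n_features)."""
    x = x - x.mean(axis=0, keepdims=True)  # center each feature
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)

def cka_matrix(activations):
    """Pairwise CKA similarity between the layers' activations."""
    n = len(activations)
    sim = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            sim[i, j] = sim[j, i] = linear_cka(activations[i], activations[j])
    return sim

def fisher_segments(sim, k):
    """Split ordered layers into k contiguous segments by dynamic programming.

    Assumed cost: within-segment sum of squared deviations of the rows of
    the similarity matrix from the segment mean row.
    Returns a list of half-open (start, end) layer index ranges.
    """
    n = sim.shape[0]
    cost = np.zeros((n, n))  # cost[i, j] covers layers i..j inclusive
    for i in range(n):
        for j in range(i, n):
            block = sim[i:j + 1]
            cost[i, j] = ((block - block.mean(axis=0)) ** 2).sum()

    best = np.full((k + 1, n + 1), np.inf)  # best[s, e]: first e layers, s segments
    split = np.zeros((k + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for seg in range(1, k + 1):
        for end in range(seg, n + 1):
            for start in range(seg - 1, end):
                cand = best[seg - 1, start] + cost[start, end - 1]
                if cand < best[seg, end]:
                    best[seg, end] = cand
                    split[seg, end] = start

    # Backtrack from the full network to recover the segment boundaries.
    bounds, end = [], n
    for seg in range(k, 0, -1):
        start = int(split[seg, end])
        bounds.append((start, end))
        end = start
    return bounds[::-1]

# Synthetic example: six "layers" whose activations form two similar groups.
rng = np.random.default_rng(0)
base_a = rng.normal(size=(64, 8))
base_b = rng.normal(size=(64, 8))
activations = [base_a + 0.01 * rng.normal(size=(64, 8)) for _ in range(3)]
activations += [base_b + 0.01 * rng.normal(size=(64, 8)) for _ in range(3)]

sim = cka_matrix(activations)
segments = fisher_segments(sim, k=2)
print(segments)
```

In a real pipeline the activations would come from forward passes of the pre-trained network on a batch of data, and each resulting segment would then be scored layer by layer before pruning.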

Group $L_{1/2}$ Regularization for Pruning Hidden Layer Nodes of Feedforward Neural Networks

Structured Deep Neural Network Pruning by Varying Regularization Parameters

Class-Aware Pruning for Efficient Neural Networks

The Role of Regularization in Shaping Weight and Node Pruning Dependency and Dynamics

LSOP: Layer-Scaled One-shot Pruning

Learning Sparse Neural Networks through L0 Regularization

Structured pruning for group regularized convolutional neural networks via dynamic regularization factor

Concurrent Training and Layer Pruning of Deep Neural Networks

Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon

Greedy Pruning with Group Lasso Provably Generalizes for Matrix Sensing

Efficient Network Compression Through Smooth-Lasso Constraint

AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models

SGLP: A Similarity Guided Fast Layer Partition Pruning for Compressing Large Deep Models

Hidden Synergy: $L_1$ Weight Normalization and 1-Path-Norm Regularization

RGP: Neural Network Pruning through Its Regular Graph Structure

Neural Network Reduction with Guided Regularizers

Training Compact DNNs with $\ell_{1/2}$ Regularization

Enhancing the Regularization Effect of Weight Pruning in Artificial Neural Networks

Group Sparse Optimization via $\ell_{p,q}$ Regularization

RGP: Neural Network Pruning Through Regular Graph With Edges Swapping

Learning k-Level Structured Sparse Neural Networks Using Group Envelope Regularization