Convolutional Neural Network Compression Based on Low-Rank Decomposition

Yaping He, Linhao Jiang, Di Wu
2024-08-29
Abstract: Deep neural networks typically impose significant computational loads and memory consumption. Moreover, the large number of parameters constrains deployment of the model on edge devices such as embedded systems. Tensor decomposition offers a clear advantage in compressing large-scale weight tensors. Nevertheless, direct utilization of low-rank decomposition typically leads to significant accuracy loss. This paper proposes a model compression method that integrates Variational Bayesian Matrix Factorization (VBMF) with orthogonal regularization. First, the model undergoes over-parameterization and training, with orthogonal regularization applied to enhance its likelihood of achieving the accuracy of the original model. Second, VBMF is employed to estimate the rank of the weight tensor at each layer. Our framework is sufficiently general to apply to other convolutional neural networks and easily adaptable to incorporate other tensor decomposition methods. Experimental results show that for both high and low compression ratios, our compression model exhibits advanced performance.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the excessive computational load and memory consumption incurred when deep neural networks (DNNs) are deployed on embedded devices. Specifically, it focuses on compressing the large-scale weight tensors of convolutional neural networks (CNNs). Although existing tensor decomposition techniques can compress parameters by factors of thousands in video tasks, in image classification tasks, especially with Tensor Train (TT) and Tensor Ring (TR) decomposition, even a slightly larger compression ratio leads to a significant loss of accuracy.

To address this, the paper proposes a model compression method that combines variational Bayesian matrix factorization (VBMF) with orthogonal regularization. The method proceeds in three steps:

1. **Over-parameterized training and orthogonal regularization**: First, over-parameterize the model and train it with orthogonal regularization, so that it is more likely to reach or exceed the accuracy of the original model.
2. **VBMF to estimate the rank**: Then use VBMF to estimate the rank of the weight tensor of each layer.
3. **Low-rank training**: Finally, conduct low-rank training to obtain the compressed model.

The main contributions of the paper include:

- Proposing a framework that combines over-parameterized training and orthogonal regularization, which not only provides better initial values but also encourages orthogonality.
- Using VBMF to estimate the rank of one mode in the TK-2 decomposition, with the rank of the other mode determined from the relationship between the input and output channels of the convolutional layer.
- Showing experimentally, on multiple DNN models, that the compressed model performs well at both high and low compression ratios.
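To make the first step more concrete, the sketch below computes one common soft form of an orthogonality penalty, the squared Frobenius norm of W Wᵀ − I, where each row of W is one flattened filter. The summary does not state the exact regularizer the paper uses, so this form, the function name `orthogonal_penalty`, and the toy filters are assumptions for illustration only.

```python
def orthogonal_penalty(rows):
    """Return ||W W^T - I||_F^2 for a matrix given as a list of rows.

    Each row is assumed to be one output filter flattened to a vector.
    This is a generic soft-orthogonality penalty, not necessarily the
    regularizer used in the paper.
    """
    penalty = 0.0
    for i, row_i in enumerate(rows):
        for j, row_j in enumerate(rows):
            # Entry (i, j) of the Gram matrix W W^T.
            dot = sum(a * b for a, b in zip(row_i, row_j))
            # Entry (i, j) of the identity matrix.
            target = 1.0 if i == j else 0.0
            penalty += (dot - target) ** 2
    return penalty


# Hypothetical toy filters: orthonormal rows give zero penalty,
# duplicated (fully correlated) rows are penalized.
orthonormal = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
correlated = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
print(orthogonal_penalty(orthonormal))  # 0.0
print(orthogonal_penalty(correlated))   # 2.0
```

In training, a term like this would typically be scaled by a small coefficient and added to the task loss, so that filters are pushed toward orthogonality without the constraint being enforced exactly.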
Through these improvements, the paper addresses the limitations of existing tensor decomposition methods in CNN compression, in particular by achieving efficient compression while maintaining model accuracy.
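As a rough illustration of how a TK-2 (Tucker-2) decomposition reduces the size of a convolutional layer, the sketch below counts parameters before and after decomposition. It assumes the usual TK-2 layout for a k×k convolution: a 1×1 convolution from the input channels to rank `r_in`, a k×k core convolution from `r_in` to `r_out`, and a 1×1 convolution from `r_out` to the output channels. The layer size and ranks are hypothetical values chosen for the arithmetic; they are not ranks estimated by VBMF and not results from the paper. Biases are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def tk2_params(c_in, c_out, k, r_in, r_out):
    """Weights after a TK-2 decomposition of the same layer.

    Counts three factors: a 1x1 conv (c_in -> r_in), a k x k core
    conv (r_in -> r_out), and a 1x1 conv (r_out -> c_out).
    """
    return c_in * r_in + r_in * r_out * k * k + r_out * c_out


# Hypothetical layer: 256 -> 256 channels, 3x3 kernel, both ranks 64.
original = conv_params(256, 256, 3)
compressed = tk2_params(256, 256, 3, r_in=64, r_out=64)
ratio = original / compressed

print(original)         # 589824
print(compressed)       # 69632
print(round(ratio, 2))  # 8.47
```

Lower ranks raise the compression ratio but leave less capacity in the core, which is why the choice of rank per layer matters for the accuracy of the compressed model.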