Automatic Rank Selection for High-Speed Convolutional Neural Network

Hyeji Kim, Chong-Min Kyung

DOI: https://doi.org/10.48550/arXiv.1806.10821

2018-06-29

Abstract: Low-rank decomposition plays a central role in accelerating convolutional neural networks (CNNs), and the rank of the decomposed kernel tensor is a key parameter that determines the complexity and accuracy of a neural network. In this paper, we define rank selection as a combinatorial optimization problem and propose a methodology to minimize network complexity while maintaining the desired accuracy. Exhaustive combinatorial optimization is not feasible because of the size of the search space. To restrict the search space and obtain the optimal rank, we define space constraint parameters with a boundary condition. We also propose a linearly approximated accuracy function to predict the fine-tuned accuracy of the optimized CNN model during cost reduction. Experimental results on AlexNet and VGG-16 show that the proposed rank selection algorithm satisfies the accuracy constraint. Our method combined with truncated SVD outperforms state-of-the-art methods in terms of inference and training time at almost the same accuracy.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper addresses how to minimize the complexity of a convolutional neural network (CNN) while maintaining a required accuracy. Specifically, it focuses on rank selection in low-rank decomposition: how to choose the rank of each decomposed kernel tensor so that computational complexity and memory usage fall while classification accuracy is preserved. The rank is the key parameter that sets the complexity of each layer, and it directly affects memory usage, running time, and accuracy. The paper therefore proposes a model-level rank selection algorithm that treats the choice of ranks as a combinatorial optimization problem and minimizes network complexity subject to a predetermined accuracy requirement. It also defines a linearly approximated accuracy function, which predicts the accuracy the optimized CNN model will reach after fine-tuning while the cost is being reduced. Experimental results on AlexNet and VGG-16 show that the proposed rank selection algorithm, combined with truncated singular value decomposition (SVD), outperforms existing acceleration methods in inference and training time at almost the same accuracy.
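To make concrete why the rank sets the complexity of a layer, the sketch below counts multiply-accumulate operations (MACs) for one convolutional layer before and after a rank-r factorization. It assumes one common scheme, in which a k x k convolution is replaced by a k x k convolution with `rank` filters followed by a 1 x 1 convolution; the paper's exact decomposition may differ, and the function names and the layer shape are hypothetical, chosen only for illustration.

```python
def conv_macs(out_h, out_w, in_ch, out_ch, k):
    """MACs of a standard k x k convolution producing an out_h x out_w map."""
    return out_h * out_w * out_ch * in_ch * k * k


def low_rank_macs(out_h, out_w, in_ch, out_ch, k, rank):
    """MACs after an assumed rank-`rank` factorization of the layer.

    Assumed scheme (not necessarily the paper's): a k x k convolution with
    `rank` filters, followed by a 1 x 1 convolution with `out_ch` filters.
    """
    first = out_h * out_w * rank * in_ch * k * k   # k x k conv, `rank` filters
    second = out_h * out_w * out_ch * rank         # 1 x 1 conv back to out_ch
    return first + second


def break_even_rank(in_ch, out_ch, k):
    """Largest rank for which the factorized layer is strictly cheaper."""
    full = in_ch * k * k * out_ch
    per_rank = in_ch * k * k + out_ch
    return (full - 1) // per_rank


if __name__ == "__main__":
    # Hypothetical layer: 3 x 3 kernel, 256 -> 256 channels, 14 x 14 output.
    shape = dict(out_h=14, out_w=14, in_ch=256, out_ch=256, k=3)
    full = conv_macs(**shape)
    for rank in (32, 64, 128, 230):
        reduced = low_rank_macs(rank=rank, **shape)
        print(f"rank={rank:4d}  MACs={reduced:>12,d}  speed-up={full / reduced:.2f}x")
    print("break-even rank:", break_even_rank(256, 256, 3))
```

Under this assumption the cost grows linearly with the rank, so each layer trades accuracy against speed through a single integer. Choosing those integers jointly across all layers is the combinatorial problem the paper describes.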

Speeding-up and compression convolutional neural networks by low-rank decomposition without fine-tuning

Convolutional neural networks with low-rank regularization

Convolutional Neural Network Compression via Dynamic Parameter Rank Pruning

Unified Framework for Neural Network Compression via Decomposition and Optimal Rank Selection

Stable Low-rank Tensor Decomposition for Compression of Convolutional Neural Network

CNN Acceleration by Low-rank Approximation with Quantized Factors

Sensitivity-based Acceleration and Compression Algorithm for Convolution Neural Network

Convolutional neural networks compression with low rank and sparse tensor decompositions

Convolutional Neural Network Compression Based on Low-Rank Decomposition

ELRT: Efficient Low-Rank Training for Compact Convolutional Neural Networks

Low-rank Tensor Decomposition for Compression of Convolutional Neural Networks Using Funnel Regularization

HALOC: Hardware-Aware Automatic Low-Rank Compression for Compact Neural Networks

Low Rank Optimization for Efficient Deep Learning: Making A Balance between Compact Architecture and Fast Training

Holistic CNN Compression Via Low-Rank Decomposition with Knowledge Transfer

A Model Compression Method Using Significant Data and Knowledge Distillation

Fast Low-Rank Matrix Learning with Nonconvex Regularization

Reduced storage direct tensor ring decomposition for convolutional neural networks compression

Bayesian tensorized neural networks with automatic rank selection

Accelerated Gradient Method for A Class of Nonconvex Low Rank Problem: Essentially Matching the Optimal Convex Convergence Rate

A continuous-time neurodynamic approach in matrix form for rank minimization