Abstract: Convolutional neural networks (CNNs) are widely used in many tasks, but training them is time-consuming and energy-hungry. Low-bit integer formats have proven promising for speeding up CNN inference and improving its energy efficiency, but the training phase can hardly benefit from this technique because of the following challenges: (1) The integer data format cannot cover the dynamic range that training requires, which causes an accuracy drop; (2) The floating-point data format keeps a large dynamic range by using many more exponent bits, which makes accumulation consume more power than in the integer format; (3) Some specially designed data formats (e.g., with group-wise scaling) could address the former two problems, but common hardware cannot support them efficiently. To tackle these challenges and let CNN training benefit from low-bit formats, we propose a low-bit training framework for convolutional neural networks that pursues a better trade-off between accuracy and energy efficiency. (1) We adopt element-wise scaling to improve the dynamic range of the data representation, which greatly reduces the quantization error; (2) We design group-wise scaling with a hardware-friendly factor format to reduce the number of element-wise exponent bits without degrading the accuracy; (3) We design a customized hardware unit that implements low-bit tensor convolution arithmetic with our multi-level scaling data format. Experiments show that our framework achieves a better trade-off between accuracy and bit-width than previous low-bit training studies. For training a variety of models on CIFAR-10, a 1-bit mantissa and a 2-bit exponent are adequate to keep the accuracy loss within 1%. On larger datasets such as ImageNet, a 4-bit mantissa and a 2-bit exponent are adequate. From an energy consumption simulation of the computing units, we estimate that training a variety of models with our framework could achieve 8.3 to 10.2× and 1.9 to 2.3× higher energy efficiency than single-precision and 8-bit floating-point arithmetic, respectively.
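
The multi-level scaling idea in the abstract (a scale shared by a group of values, a small per-element exponent, and a low-bit mantissa) can be illustrated with a minimal NumPy sketch. This is not the paper's format or implementation: the function name quantize_group, the power-of-two group scale, the separate sign, the round-to-nearest rule, and the treatment of one flat array as a single group are all assumptions made for illustration.

```python
import numpy as np

def quantize_group(x, mantissa_bits=4, exponent_bits=2):
    """Simulate quantizing one group of values (hypothetical helper).

    Assumed layout: one power-of-two scale per group, an unsigned
    per-element exponent of `exponent_bits` bits, and a signed mantissa
    with `mantissa_bits` magnitude bits. Returns the dequantized values.
    """
    x = np.asarray(x, dtype=np.float64)
    nonzero = x != 0
    if not np.any(nonzero):
        return np.zeros_like(x)

    # frexp gives x = m * 2**e with 0.5 <= |m| < 1 for nonzero x.
    _, e = np.frexp(x)

    # Group-wise scale: the largest exponent in the group.
    group_exp = e[nonzero].max()

    # Element-wise exponent: how far each element sits below the group
    # scale, limited to what `exponent_bits` can store.
    max_shift = 2 ** exponent_bits - 1
    shift = np.clip(group_exp - e, 0, max_shift)

    # Effective per-element scale; |x / elem_scale| < 1 for every element.
    elem_scale = 2.0 ** (group_exp - shift)
    normalized = x / elem_scale

    # Low-bit mantissa: round to the nearest of 2**mantissa_bits steps
    # and keep the magnitude within the representable range.
    levels = 2 ** mantissa_bits
    q = np.clip(np.round(normalized * levels), -(levels - 1), levels - 1)

    return q / levels * elem_scale

if __name__ == "__main__":
    values = [1.0, 0.3, 0.07]
    print(quantize_group(values, mantissa_bits=4, exponent_bits=2))
    print(quantize_group(values, mantissa_bits=4, exponent_bits=0))
```

In this sketch, setting exponent_bits=0 forces every element onto the grid of the group scale, so small values are represented coarsely. With exponent_bits=2, smaller elements get a finer grid, which is the dynamic-range benefit the abstract attributes to element-wise scaling.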

A CNN Channel Pruning Low-Bit Framework Using Weight Quantization with Sparse Group Lasso Regularization

SUBP: Soft Uniform Block Pruning for 1×N Sparse CNNs Multithreading Acceleration

Single-shot Pruning and Quantization for Hardware-Friendly Neural Network Acceleration

A Pruning Method Based on the Dissimilarity of Angle among Channels and Filters

Learning Low Resource Consumption CNN through Pruning and Quantization

Efficient Network Compression Through Smooth-Lasso Constraint

An Efficient Channel-level Pruning for CNNs without Fine-tuning

Focused Quantization for Sparse CNNs

Improving Network Slimming with Nonconvex Regularization

Prune the Convolutional Neural Networks with Sparse Shrink

Exploiting Weight-Level Sparsity in Channel Pruning with Low-Rank Approximation

Low-precision CNN Model Quantization based on Optimal Scaling Factor Estimation

Automatic channel pruning via clustering and swarm intelligence optimization for CNN

Learning Efficient Convolutional Networks Through Network Slimming

Latency-aware Automatic CNN Channel Pruning with GPU Runtime Analysis

A Unified Framework of DNN Weight Pruning and Weight Clustering/Quantization Using ADMM

KCNN: Kernel-wise Quantization to Remarkably Decrease Multiplications in Convolutional Neural Network

Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights

Exploiting Channel Similarity for Network Pruning

Exploring the Potential of Low-bit Training of Convolutional Neural Networks