Abstract: Model binarization is an effective method of compressing neural networks and accelerating their inference. However, a significant performance gap still exists between the 1-bit model and the 32-bit one. Our empirical study shows that binarization causes a great loss of information in both forward and backward propagation. We present a novel Distribution-sensitive Information Retention Network (DIR-Net) that retains this information by improving internal propagation and introducing external representations. DIR-Net relies on three main technical contributions: (1) Information Maximized Binarization (IMB), which simultaneously minimizes the information loss and the binarization error of weights/activations through weight balance and standardization; (2) Distribution-sensitive Two-stage Estimator (DTE), which retains gradient information through a distribution-sensitive soft approximation that jointly considers updating capability and gradient accuracy; (3) Representation-align Binarization-aware Distillation (RBD), which retains representation information by distilling representations between the full-precision and binarized networks. DIR-Net investigates both the forward and backward processes of BNNs from a unified information perspective, thereby providing new insight into the mechanism of network binarization. The three techniques are versatile and effective, and can be applied to various structures to improve BNNs. Comprehensive experiments on image classification and object detection tasks show that DIR-Net consistently outperforms state-of-the-art binarization approaches under mainstream and compact architectures such as ResNet, VGG, EfficientNet, DARTS, and MobileNet. Additionally, we run DIR-Net on real-world resource-limited devices, where it achieves 11.1x storage saving and 5.4x speedup.
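
To make the idea behind weight balance and standardization concrete, the following is a minimal sketch and not the authors' implementation. It assumes that "balance" means shifting the weights to zero mean and that "standardization" means scaling them to unit standard deviation before applying the sign function; the helper names (`balance_and_standardize`, `binarize`, `binary_entropy`) and the synthetic Gaussian weights are hypothetical. For a roughly symmetric weight distribution, centering pushes the share of +1 values toward one half, which is where the entropy of a binary variable is largest.

```python
import math
import random
import statistics


def balance_and_standardize(weights):
    """Shift weights to zero mean, then scale to unit standard deviation.

    Assumed reading of "weight balance and standardization"; the abstract
    does not spell out the exact formulas.
    """
    mean = statistics.fmean(weights)
    centered = [w - mean for w in weights]
    std = statistics.pstdev(centered)
    return [c / std for c in centered]


def binarize(values):
    """Map each value to +1 or -1 with the sign function (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def binary_entropy(bits):
    """Shannon entropy, in bits, of a list of +1/-1 values."""
    p = sum(1 for b in bits if b == 1) / len(bits)
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


random.seed(0)
# Hypothetical latent weights with a non-zero mean, so a plain sign() is skewed.
raw = [random.gauss(0.5, 1.0) for _ in range(10_000)]

entropy_plain = binary_entropy(binarize(raw))
entropy_balanced = binary_entropy(binarize(balance_and_standardize(raw)))

print(f"entropy without balancing: {entropy_plain:.4f} bits")
print(f"entropy with balancing:    {entropy_balanced:.4f} bits")
```

In this toy setting the balanced weights yield a higher-entropy binary code than the raw ones, which illustrates the information-retention motivation. The sketch covers only the forward binarization of weights; it does not model the gradient estimator (DTE) or the distillation step (RBD).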

TP-ADMM: An Efficient Two-Stage Framework for Training Binary Neural Networks

AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets

An adiabatic method to train binarized artificial neural networks

Efficient and Robust Mixed-Integer Optimization Methods for Training Binarized Deep Neural Networks

A Convergent ADMM Framework for Efficient Neural Network Training

Toward Extremely Low Bit and Lossless Accuracy in DNNs with Progressive ADMM

Deep Spiking Neural Networks with Binary Weights for Object Recognition

Reweighted Alternating Direction Method of Multipliers for DNN weight pruning

Training Deep Neural Networks with Discrete State Transition

BinaryConnect: Training Deep Neural Networks with binary weights during propagations

A foundation for exact binarized morphological neural networks

Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM

Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1

A Systematic DNN Weight Pruning Framework Using Alternating Direction Method of Multipliers

Boosting Binary Neural Networks via Dynamic Thresholds Learning

Distribution-sensitive Information Retention for Accurate Binary Neural Network

Binary Early-Exit Network for Adaptive Inference on Low-Resource Devices

Towards Efficient Tensor Decomposition-Based DNN Model Compression with Optimization Framework

Network Binarization via Contrastive Learning

Efficient Multitask Dense Predictor via Binarization

Iterative Training: Finding Binary Weight Deep Neural Networks with Layer Binarization