Abstract: Model binarization is an effective method of compressing neural networks and accelerating their inference. However, a significant performance gap still exists between the 1-bit model and the 32-bit one. Empirical study shows that binarization causes a great loss of information in both forward and backward propagation. We present a novel Distribution-sensitive Information Retention Network (DIR-Net) that retains the information in forward and backward propagation by improving internal propagation and introducing external representations. DIR-Net relies mainly on three technical contributions: (1) Information Maximized Binarization (IMB), which minimizes the information loss and the binarization error of weights/activations simultaneously through weight balance and standardization; (2) Distribution-sensitive Two-stage Estimator (DTE), which retains the information of gradients through a distribution-sensitive soft approximation that jointly considers the updating capability and the accuracy of the gradient; (3) Representation-align Binarization-aware Distillation (RBD), which retains the representation information by distilling the representations between full-precision and binarized networks. DIR-Net investigates both the forward and backward processes of BNNs from a unified information perspective, thereby providing new insight into the mechanism of network binarization. The three techniques are versatile and effective, and can be applied to various structures to improve BNNs. Comprehensive experiments on image classification and object detection tasks show that DIR-Net consistently outperforms state-of-the-art binarization approaches under mainstream and compact architectures, such as ResNet, VGG, EfficientNet, DARTS, and MobileNet. Additionally, we deploy DIR-Net on real-world resource-limited devices, where it achieves 11.1x storage saving and 5.4x speedup.
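
The weight balance and standardization step named under IMB can be illustrated with a small sketch. It assumes that "balance" means subtracting the mean of the weights and "standardization" means dividing by their standard deviation before taking the sign; the abstract does not spell out the exact formulation, so the function names and the sample weights below are hypothetical. The example only shows why centering helps: when the weights are roughly symmetric around their mean, the resulting bits split evenly between +1 and -1, which maximizes their entropy.

```python
import math
import statistics


def balance_and_standardize(weights):
    """Zero-center the weights and scale them to unit standard deviation.

    Assumption: this is one plausible reading of "weight balance and
    standardization"; it is not taken from the paper's equations.
    """
    mean = statistics.fmean(weights)
    std = statistics.pstdev(weights)
    if std == 0.0:
        raise ValueError("weights must not all be equal")
    return [(w - mean) / std for w in weights]


def binarize(weights):
    """Map each balanced, standardized weight to +1 or -1 by its sign."""
    return [1 if w >= 0 else -1 for w in balance_and_standardize(weights)]


def bit_entropy(bits):
    """Shannon entropy (in bits) of a +1/-1 sequence; 1.0 is the maximum."""
    p = sum(1 for b in bits if b == 1) / len(bits)
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


# Hypothetical weights that are all positive, so a plain sign() is degenerate.
raw = [0.9, 1.1, 1.3, 0.7, 1.2, 0.8]
naive = [1 if w >= 0 else -1 for w in raw]   # every bit is +1
balanced = binarize(raw)                     # bits split evenly

print(bit_entropy(naive), bit_entropy(balanced))
```

Mean-centering balances the signs only for roughly symmetric weight distributions; a skewed distribution would need a different center (for example the median) to reach the same even split.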

Binary Convolutional Neural Network with High Accuracy and Compression Rate

How to Train A Compact Binary Neural Network with High Accuracy?

A Highly Efficient Training-Aware Convolutional Neural Network Compression Paradigm

Training Binary Neural Networks with Real-to-Binary Convolutions

Learning to Binarize Convolutional Neural Networks with Adaptive Neural Encoder

Model encoding of binary neural networks

Gradient Corrected Approximation for Binary Neural Networks

Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks

AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets

Network Binarization via Contrastive Learning

Forward and Backward Information Retention for Accurate Binary Neural Networks

Neural Network Compression using Binarization and Few Full-Precision Weights

Gradient Matters: Designing Binarized Neural Networks Via Enhanced Information-Flow

Distribution-sensitive Information Retention for Accurate Binary Neural Network

Bi-Real Net V2: Rethinking Non-linearity for 1-Bit CNNs and Going Beyond

A Novel Binary Neural Network with Enhanced Dense Connection

An adiabatic method to train binarized artificial neural networks

Efficient Binary 3D Convolutional Neural Network and Hardware Accelerator

Modulated Convolutional Networks

IR-Net: Forward and Backward Information Retention for Highly Accurate Binary Neural Networks

Group Binary Weight Networks