Abstract:In this paper, we propose to train a network with both binary weights and binary activations, designed specifically for mobile devices with limited computation capacity and power consumption. Previous works on quantizing CNNs uncritically assume the same architecture with full-precision networks, which we term value approximation. Their objective is to preserve the floating-point information using a set of discrete values. However, we take a novel view---for best performance it is very likely that a different architecture may be better suited to deal with binary weights as well as binary activations. Thus we directly design such a highly accurate binary network structure, which is termed structure approximation. In particular, we propose a network decomposition strategy in which we divide the networks into groups and aggregate a set of homogeneous binary branches to implicitly reconstruct the full-precision intermediate feature maps. In addition, we also learn the connections between each group. We further provide a comprehensive comparison among all quantization categories. Experiments on ImageNet classification tasks demonstrate the superior performance of the proposed model, named Group-Net, over various popular architectures. In particular, we outperform the previous best binary neural network in terms of accuracy as well as saving huge computational complexity. Furthermore, the proposed Group-Net can effectively utilize task specific properties for strong generalization. In particular, we propose to extend Group-Net for \textbf{lossless} semantic segmentation. This is the first work proposed on solving dense pixels prediction based on BNNs in the literature. Actually, we claim that considering both value and structure approximation should be the future development direction of BNNs.

Model encoding of binary neural networks

Residual Quantization for Low Bit-Width Neural Networks.

Mix-precision Model Encoding and Quantization

Efficient Quantization for Neural Networks with Binary Weights and Low Bitwidth Activations.

Structured Binary Neural Networks for Image Recognition

Binary Convolutional Neural Network with High Accuracy and Compression Rate

Training Compact Neural Networks with Binary Weights and Low Precision Activations

Mixed-Precision Quantized Neural Network with Progressively Decreasing Bitwidth For Image Classification and Object Detection.

Learning to Binarize Convolutional Neural Networks with Adaptive Neural Encoder

A Novel Low-Bit Quantization Strategy for Compressing Deep Neural Networks

Adaptive Binary-Ternary Quantization

Cellular Binary Neural Network for Accurate Image Classification and Semantic Segmentation

Bit-Quantized-Net: an Effective Method for Compressing Deep Neural Networks.

Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation.

Iterative Training: Finding Binary Weight Deep Neural Networks with Layer Binarization

Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks

Model compression as constrained optimization, with application to neural nets. Part II: quantization

Efficient Neural Compression with Inference-time Decoding

Bit Efficient Quantization for Deep Neural Networks

Group Binary Weight Networks

Increasing Information Entropy of Both Weights and Activations for the Binary Neural Networks