Abstract: Residual networks usually require more layers to achieve strong performance in complex visual categorization tasks such as pose estimation. However, increasing the number of layers places a heavy burden on training and forward inference and raises the risk of over-fitting. This paper proposes the local binary residual block (LBB) to reduce the trainable parameters and FLOPs of very deep residual networks while largely preserving accuracy. In each LBB, the <math>3×3</math> filters are binarized based on a Bernoulli distribution under a sparsity constraint, an activation function provides the non-linear response, and the linear <math>1×1</math> filters are learned as real-valued weights. After stochastic binarized initialization, the <math>3×3</math> filters in an LBB are not updated during training. This architecture reduces at least 69.2% of the trainable parameters and 70.5% of the FLOPs compared to the original model. The LBB is derived from three observations: 1) the activated responses of a standard <math>k×k</math> convolutional layer can be approximated by combining binarized <math>k×k</math> filters with <math>1×1</math> filters; 2) most computation in very deep residual networks is spent on the <math>3×3</math> convolutions; and 3) <math>1×1</math> filters play an important role in cross-channel information integration. In addition, the LBB module is suitable for very deep network frameworks, including the stacked hourglass network and pyramid residual modules. Experiments are conducted on the MPII and LSP datasets for pose estimation; the CIFAR-10, CIFAR-100 and ImageNet datasets for object recognition; and the ECSSD, HKU-IS, PASCAL-S, DUT-OMRON and DUTS datasets for saliency detection. The results show that our model accelerates the training and inference of the network with only a slight performance degradation.
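
The block structure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the ternary {-1, 0, +1} weight encoding, the `sparsity` parameter, the choice of ReLU as the activation, and the identity skip connection are all assumptions made for illustration; the abstract states only that the 3×3 filters are binarized via a Bernoulli distribution under a sparse constraint and then left fixed, and that the 1×1 filters are learned.

```python
import numpy as np

def sample_binary_filters(out_ch, in_ch, k=3, sparsity=0.5, seed=0):
    """Sample fixed k x k filters with entries in {-1, 0, +1}.

    Assumption: each weight is nonzero with probability `sparsity`, and a
    nonzero weight is +1 or -1 with equal probability (two Bernoulli draws).
    """
    rng = np.random.default_rng(seed)
    shape = (out_ch, in_ch, k, k)
    nonzero = rng.random(shape) < sparsity
    signs = np.where(rng.random(shape) < 0.5, 1.0, -1.0)
    return nonzero * signs

def conv2d_same(x, w):
    """Naive zero-padded 'same' cross-correlation.

    x has shape (C_in, H, W); w has shape (C_out, C_in, k, k) with odd k.
    """
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, wd = x.shape[1:]
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Accumulate the contribution of kernel offset (i, j).
            out += np.einsum("oc,chw->ohw", w[:, :, i, j],
                             xp[:, i:i + h, j:j + wd])
    return out

def lbb_forward(x, binary_w, pointwise_w):
    """Fixed binary 3x3 conv -> ReLU (assumed) -> learnable 1x1 conv, plus skip.

    pointwise_w has shape (C_out, C_mid); C_out must equal x's channel count
    so the identity skip connection can be added.
    """
    hidden = np.maximum(conv2d_same(x, binary_w), 0.0)
    mixed = np.einsum("om,mhw->ohw", pointwise_w, hidden)
    return x + mixed

def trainable_params(in_ch, out_ch, mid_ch, k=3):
    """Trainable weights: standard k x k conv vs. the 1x1 weights of this sketch."""
    standard = out_ch * in_ch * k * k
    lbb = out_ch * mid_ch  # the binary k x k filters are frozen
    return standard, lbb
```

In this sketch only `pointwise_w` would receive gradient updates, which is where the parameter saving comes from. The counts returned by `trainable_params` describe a single isolated layer under these assumptions and are not expected to reproduce the network-level 69.2% figure reported above.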

Local-binarized very deep residual network for visual categorization

Image Semantic Segmentation Based on Region and Deep Residual Network

LWRN: Light-Weight Residual Network for Edge Detection

A lightweight and stochastic depth residual attention network for remote sensing scene classification

Eye Localization Based on Weight Binarization Cascade Convolution Neural Network

Local Binary Convolutional Neural Networks

BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition

Residual Feature-Reutilization Inception Network

Residual Networks of Residual Networks: Multilevel Residual Networks

Weighted Residuals for Very Deep Networks

LRPRNet: Lightweight Deep Network by Low-Rank Pointwise Residual Convolution

Network Binarization via Contrastive Learning

BDC-Occ: Binarized Deep Convolution Unit For Binarized Occupancy Network

SBNN: Slimming binarized neural network

Single-stage object detector with local binary pattern for remote sensing images

LPRNet: Lightweight Deep Network by Low-rank Pointwise Residual Convolution

From Local Binary Patterns to Pixel Difference Networks for Efficient Visual Representation Learning

ParaLkResNet: an efficient multi-scale image classification network

Deep Convolutional Neural Networks with Merge-and-Run Mappings

Embarrassingly Simple Binarization for Deep Single Imagery Super-Resolution Networks

Improved Residual Networks for Image and Video Recognition