Efficient Neural Image Decoding Via Fixed-Point Inference

Weixin Hong, Tong Chen, Ming Lu, Shiliang Pu, Zhan Ma
DOI: https://doi.org/10.1109/tcsvt.2020.3040367
IF: 5.859
2020-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Learned image coding has recently emerged with coding efficiency superior to conventional methods. It is, however, criticized for the high complexity of its deep neural network (DNN) architectures, especially on resource-constrained mobile platforms. We therefore devise a two-stage approach. First, range pre-processing constrains the dynamic range of feature map activations by exploiting their sparsity and densely clustered distribution. Second, layer-wise range-adaptive quantization is applied to the convolutional parameters (e.g., weights and biases), and simple yet efficient linear scaling and range-dependent normalization are applied to the activations, leading to a fully fixed-point inference architecture. All arithmetic operations and associated data tensors are processed using low-bit-width fixed-point numbers, which significantly reduces computational complexity and memory footprint and eliminates the platform-dependent inconsistency induced by floating-point operations. We first exemplify such fixed-point inference in a DNN-based image decoder, showing coding efficiency comparable to its native floating-point model when both are measured against the same anchor, a High-Efficiency Video Coding (HEVC)-based intra image coder. We also extend the proposed approach to a super-resolution network for learned resolution scaling-based video streaming and to VGG network-based classification tasks, both of which exhibit negligible performance loss. These results demonstrate that our approach generalizes to efficient DNN processing across various tasks. All materials are publicly accessible at http://njuvision.github.io/fixed-point/.
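The idea of layer-wise, range-adaptive fixed-point quantization can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a signed 8-bit format with a power-of-two scale chosen per layer from the largest weight magnitude, and the function names (`choose_frac_bits`, `quantize`, `fixed_point_dot`) and the example layers are hypothetical.

```python
import math

def choose_frac_bits(values, bit_width=8):
    """Pick fractional bits so the largest magnitude fits a signed bit_width integer.
    Assumption: power-of-two scaling; the paper's range adaptation may differ."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return bit_width - 1
    # Bits needed for the integer part (the sign bit is counted separately).
    int_bits = math.floor(math.log2(max_abs)) + 1
    return bit_width - 1 - int_bits

def quantize(values, frac_bits, bit_width=8):
    """Round to the nearest fixed-point integer and clamp to the signed range."""
    lo, hi = -(1 << (bit_width - 1)), (1 << (bit_width - 1)) - 1
    scale = 2.0 ** frac_bits
    return [min(hi, max(lo, round(v * scale))) for v in values]

def dequantize(q_values, frac_bits):
    """Map fixed-point integers back to real values (for checking only)."""
    scale = 2.0 ** frac_bits
    return [q / scale for q in q_values]

def fixed_point_dot(w_q, x_q, w_frac, x_frac, out_frac):
    """Integer-only dot product; the accumulator carries w_frac + x_frac
    fractional bits and is rescaled to out_frac with a bit shift."""
    acc = sum(w * x for w, x in zip(w_q, x_q))
    shift = w_frac + x_frac - out_frac
    return acc >> shift if shift >= 0 else acc << -shift

# Hypothetical per-layer weights with different dynamic ranges.
layers = {"conv1": [0.5, -0.25, 0.1], "conv2": [3.7, -1.2, 0.05]}
quantized = {}
for name, weights in layers.items():
    frac = choose_frac_bits(weights)
    quantized[name] = (quantize(weights, frac), frac)

# Integer-only inference step for conv1 on a hypothetical activation vector.
activations = [1.0, 0.5, -0.5]
x_frac = choose_frac_bits(activations)
x_q = quantize(activations, x_frac)
w_q, w_frac = quantized["conv1"]
y_q = fixed_point_dot(w_q, x_q, w_frac, x_frac, out_frac=x_frac)
y = y_q / 2.0 ** x_frac  # approximate real-valued result
```

Because each layer gets its own number of fractional bits, a layer with small weights keeps finer resolution than one with a wide range, and the multiply-accumulate itself uses only integer arithmetic and shifts, which is what makes the result bit-exact across platforms.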