Abstract: To distribute the storage and computation load caused by the growing capacity of deep neural networks (DNNs), the collaborative intelligence (CI) framework has been proposed, in which a deep model is split and executed on two distributed devices. Intermediate features must be transferred from the front end to the back end to perform distributed inference, so the transmission process is the bottleneck that determines inference efficiency in terms of accuracy and delay. In particular, for a bandwidth-limited human-in-the-loop visual analysis task, feature compression approaches need to be explored to reduce the volume of data to be transmitted, so as to achieve low transmission delay while maintaining analysis performance and human perception ability. In this paper, the redundancy of intermediate features at both the spatial and statistical levels is first analyzed. A mathematical expression for the goal of feature compression is formulated, based on which a low-rate feature compression approach built on two-level redundancy removal is proposed. For the front-end device, an information squeezing (IS) module is developed to squeeze the key information of the input image and inject it into a low-resolution image. A backbone network is then split into two parts according to the application demands of CI, and the two parts are deployed at the front and back ends, respectively. With a specifically designed objective function, the IS module and the partitioned backbone network are optimized collaboratively to reduce the two-level redundancy, thereby compressing the intermediate feature. A generative adversarial network (GAN)-based restoration module is proposed to recover an image at the original resolution from the compressed feature, to satisfy human perception. Comprehensive experiments are conducted to validate the efficiency of the proposed method.
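
The data flow described in the abstract (squeeze the input, run the front part of the model, transmit a compact feature, finish the analysis at the back end) can be illustrated with a toy sketch. Everything below is a hypothetical stand-in chosen for illustration: average pooling in place of the learned IS module, a ReLU-like map and a global average in place of the two halves of the backbone, and plain uniform quantization in place of the learned redundancy removal. It is not the paper's method, and the GAN-based restoration module is omitted.

```python
# Illustrative sketch only: every function here is a hypothetical stand-in,
# not the paper's IS module, backbone, or compression scheme.

def squeeze(image, factor=2):
    """Stand-in for the IS module: average-pool to a lower resolution."""
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[i + di][j + dj]
                for di in range(factor) for dj in range(factor))
            / (factor * factor)
            for j in range(0, w - w % factor, factor)
        ]
        for i in range(0, h - h % factor, factor)
    ]

def front_part(x):
    """Stand-in for the front-end half of the backbone: a ReLU-like map."""
    return [[max(0.0, v - 0.5) for v in row] for row in x]

def quantize(feature, levels=16):
    """Uniformly quantize the feature to integer codes for transmission."""
    lo = min(min(row) for row in feature)
    hi = max(max(row) for row in feature)
    scale = (hi - lo) / (levels - 1) or 1.0  # avoid zero scale on flat input
    codes = [[round((v - lo) / scale) for v in row] for row in feature]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Back-end reconstruction of the feature from the received codes."""
    return [[c * scale + lo for c in row] for row in codes]

def back_part(feature):
    """Stand-in for the back-end half: global average as a toy analysis score."""
    flat = [v for row in feature for v in row]
    return sum(flat) / len(flat)

# Front end: 8x8 toy image -> 4x4 low-resolution image -> feature -> codes.
image = [[(i * 8 + j) / 63 for j in range(8)] for i in range(8)]
small = squeeze(image)
feature = front_part(small)
codes, lo, scale = quantize(feature)

# Back end: reconstruct the feature and finish the analysis.
score = back_part(dequantize(codes, lo, scale))

# Reference result without quantization, for comparison.
reference = back_part(front_part(small))
```

In this toy setting the front end sends 16 small integer codes plus two floats instead of 64 pixel values, and the back-end score differs from the unquantized reference by at most half a quantization step. In the approach the abstract describes, the squeezing and the feature statistics are learned jointly under a designed objective rather than fixed by hand as they are here.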

Agnostic Feature Compression with Semantic Guided Channel Importance Analysis

A feature compression method based on similarity matching

Sensitivity-Aware Bit Allocation for Intermediate Deep Feature Compression

Intermediate Deep Feature Compression: the Next Battlefield of Intelligent Sensing

Toward Intelligent Sensing: Intermediate Deep Feature Compression

Lossy Intermediate Deep Learning Feature Compression and Evaluation

Low-Rate Feature Compression for Collaborative Intelligence: Reducing Redundancy in Spatial and Statistical Levels

An End-to-End Channel-Adaptive Feature Compression Approach in Device-Edge Co-Inference Systems

Attention-based Feature Compression for CNN Inference Offloading in Edge Computing

Lightweight compression of neural network feature tensors for collaborative intelligence

FrankenSplit: Efficient Neural Feature Compression with Shallow Variational Bottleneck Injection for Mobile Edge Computing

Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach

A Broad-Spectrum and High-Throughput Compression Engine for Neural Network Processors

An Efficient CNN Inference Accelerator Based on Intra- and Inter-Channel Feature Map Compression

2C-Net: integrate image compression and classification via deep neural network

Lightweight Compression of Intermediate Neural Network Features for Collaborative Intelligence

Deep learning model compression using network sensitivity and gradients

Semantically Scalable Image Coding With Compression Of Feature Maps

ASC: Adaptive Scale Feature Map Compression for Deep Neural Network

Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization

Application Specific Compression of Deep Learning Models