Abstract: With the growth of computer vision-based applications, a rapidly growing number of images are uploaded to cloud servers that host online computer vision algorithms, usually in the form of deep learning models. JPEG is the de facto compression and encapsulation method for images. However, the standard JPEG configuration does not always perform well for images that are to be processed by a deep learning model. For example, at the same inference accuracy on ImageNet, the standard JPEG quality level incurs a 50% size overhead compared with the best quality level selection for popular computer vision models (e.g., InceptionNet and ResNet). Even so, designing a better JPEG configuration for online computer vision-based services remains extremely challenging. First, cloud-based computer vision models are usually a black box to end-users; thus, the JPEG configuration must be designed without knowledge of their model structures. Second, the “optimal” JPEG configuration is not fixed; instead, it is determined by confounding factors, including the characteristics of the input images and the model, the expected accuracy and image size, and so forth. In this article, we propose AdaCompress, a reinforcement learning (RL)-based adaptive JPEG configuration framework. In particular, we design an edge (i.e., user-side) RL agent that learns the compression quality level that achieves an expected inference accuracy and upload image size, using only the online inference results and no details of the model structures. Furthermore, we design an explore-exploit mechanism that lets the framework quickly switch agents when it detects a performance degradation, mainly due to a change in the input (e.g., images captured in daytime versus at night).
Our evaluation experiments using real-world online computer vision-based APIs from Amazon Rekognition, Face++, and Baidu Vision show that our approach outperforms existing baselines, reducing the size of images by one-half to one-third while the overall classification accuracy decreases only slightly. Meanwhile, AdaCompress promptly re-trains or re-loads the RL agent to maintain performance.
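
To make the idea of learning a quality level purely from online inference feedback concrete, the following is a minimal sketch. It is not the AdaCompress agent: it swaps in a simple tabular epsilon-greedy bandit, and the quality levels, the reward weights `alpha` and `beta`, and the `simulated_upload` stand-in for compression plus cloud inference are all hypothetical assumptions made for illustration.

```python
import random

# Hypothetical candidate JPEG quality levels (not taken from the article).
QUALITY_LEVELS = [15, 35, 55, 75, 95]


class EpsilonGreedyQualityAgent:
    """Tabular epsilon-greedy bandit over quality levels (illustrative only)."""

    def __init__(self, levels, epsilon=0.1, seed=0):
        self.levels = list(levels)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {q: 0 for q in self.levels}
        self.values = {q: 0.0 for q in self.levels}  # running mean reward

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.levels)
        return max(self.levels, key=lambda q: self.values[q])

    def update(self, quality, reward_value):
        # Incremental update of the running mean reward for this level.
        self.counts[quality] += 1
        n = self.counts[quality]
        self.values[quality] += (reward_value - self.values[quality]) / n

    def best(self):
        return max(self.levels, key=lambda q: self.values[q])


def reward(correct, size_ratio, alpha=1.0, beta=0.5):
    """Assumed reward: credit for a correct prediction minus a size penalty."""
    return alpha * float(correct) - beta * size_ratio


def simulated_upload(quality, rng):
    """Toy stand-in for compress + black-box inference: (correct, size_ratio)."""
    p_correct = min(1.0, 0.3 + quality / 50.0)  # toy accuracy curve
    size_ratio = quality / 100.0                # toy size curve
    return rng.random() < p_correct, size_ratio


def train(steps=2000, seed=0):
    agent = EpsilonGreedyQualityAgent(QUALITY_LEVELS, epsilon=0.1, seed=seed)
    env_rng = random.Random(seed + 1)
    for _ in range(steps):
        q = agent.choose()
        correct, size_ratio = simulated_upload(q, env_rng)
        agent.update(q, reward(correct, size_ratio))
    return agent


if __name__ == "__main__":
    trained = train()
    print("preferred quality level:", trained.best())
```

The agent only ever observes whether the black-box prediction was correct and how large the upload was, which mirrors the constraint described in the abstract. Detecting a performance drop and switching or re-training agents would sit on top of a loop like this and is not shown.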

Adaptive Compression for Online Computer Vision: an Edge Reinforcement Learning Approach
