Abstract: Intelligent applications based on deep neural networks increasingly run on Internet of Things (IoT) devices. Unfortunately, the computing resources of these devices are limited, which seriously hinders the widespread deployment of smart applications. A popular solution is to offload part of the computation from the IoT device to the cloud through device-cloud collaboration. However, existing collaboration approaches may suffer from long network transmission delay or degraded accuracy because of the large amount of intermediate results, which poses enormous challenges for tasks such as object detection that require massive computing resources. In this paper, we propose an efficient Device-Cloud Collaborative Inference (DCCI) object detection framework, which dynamically adjusts the amount of transferred data according to the content of the input images. Specifically, a content-aware hard-case discriminator automatically classifies input images as hard cases or simple cases. Hard cases are uploaded to the cloud and processed by a heavyweight model deployed there, while simple cases are processed by a lightweight model deployed on the IoT device; the lightweight model is automatically compressed using reinforcement learning according to the resource constraints of the IoT device. Furthermore, a collaborative scheduler based on the run-time load and network transmission capability of the IoT devices is proposed to optimize the collaborative computation between the IoT devices and the cloud. Extensive experimental evaluations show that, compared to the Device-only approach, DCCI reduces the memory footprint and compute resources of IoT devices by more than 90.0% and 30.87%, respectively. Compared to the Cloud-centric approach, DCCI reduces network bandwidth by 2.0×. In addition, compared with the state-of-the-art DNN partitioning method, DCCI reduces inference latency by 1.2× and IoT device energy consumption by 1.3× under the same accuracy constraint.
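The routing idea described in the abstract can be illustrated with a minimal Python sketch. This is not the DCCI implementation: the abstract does not specify how the discriminator scores an image or how the scheduler weighs load against bandwidth, so the names `DeviceState`, `is_hard_case`, and `route`, the difficulty score, and all thresholds below are hypothetical assumptions chosen only to show the shape of the decision.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    """Hypothetical snapshot of the IoT device at scheduling time."""
    cpu_load: float        # fraction of device compute in use, 0.0 to 1.0
    bandwidth_mbps: float  # currently measured uplink bandwidth


def is_hard_case(difficulty_score: float, threshold: float = 0.5) -> bool:
    """Stand-in for the content-aware discriminator (assumed to emit a score)."""
    return difficulty_score >= threshold


def route(difficulty_score: float, state: DeviceState, image_mbits: float,
          threshold: float = 0.5, max_upload_s: float = 0.5,
          max_load: float = 0.9) -> str:
    """Return "cloud" or "device" for one input image.

    Assumed policy: hard cases go to the cloud unless the upload would be
    too slow; simple cases stay on the device unless it is overloaded and
    the link is fast enough.
    """
    if state.bandwidth_mbps > 0:
        upload_s = image_mbits / state.bandwidth_mbps
    else:
        upload_s = float("inf")  # no connectivity: upload is impossible

    if is_hard_case(difficulty_score, threshold):
        return "cloud" if upload_s <= max_upload_s else "device"
    if state.cpu_load > max_load and upload_s <= max_upload_s:
        return "cloud"
    return "device"
```

For example, under these assumed thresholds, `route(0.8, DeviceState(0.2, 20.0), 4.0)` sends a hard image to the cloud because the estimated upload takes 0.2 s, while the same image on a 1 Mbps link falls back to the on-device lightweight model.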

Content-Aware Adaptive Device-Cloud Collaborative Inference for Object Detection
