Abstract: Edge computing offers an additional layer of compute infrastructure closer to the data source before raw data from privacy-sensitive and performance-critical applications is transferred to a cloud data center. Deep Neural Networks (DNNs) are one class of applications reported to benefit from collaborative computing across the edge and the cloud. A DNN is partitioned such that specific layers are deployed onto the edge and the cloud to meet performance and privacy objectives. However, there is limited understanding of: (a) whether and how evolving operational conditions (increased CPU and memory utilization at the edge or reduced data transfer rates between the edge and the cloud) affect the performance of already deployed DNNs, and (b) whether a new partition configuration is required to maximize performance. A DNN that adapts to changing operational conditions is referred to as an 'adaptive DNN'. This paper investigates whether there is a case for adaptive DNNs in edge computing by considering four questions: (i) Are DNNs sensitive to operational conditions? (ii) How sensitive are DNNs to operational conditions? (iii) Do individual operational conditions and combinations of them affect DNNs equally? (iv) Is DNN partitioning sensitive to hardware architectures on the cloud/edge? The exploration is carried out in the context of 8 pre-trained DNN models, and the results presented are from analyzing nearly 8 million data points. The results highlight that network conditions affect DNN performance more than CPU- or memory-related operational conditions. Repartitioning is noted to provide a performance gain in a number of cases, but no specific trend was noted in relation to the underlying hardware architecture. Nonetheless, the need for adaptive DNNs is confirmed.
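To make the idea of a partition configuration concrete, the following is a minimal sketch of how a partition point could be chosen and why it may shift when the data transfer rate changes. It is an illustration only, not the method used in the paper: the additive latency model, the function names `partition_latency` and `best_partition`, and all per-layer timings and tensor sizes are hypothetical assumptions.

```python
# Illustrative sketch only: a simple additive latency model for a layer-wise
# edge/cloud split. All names and numbers are hypothetical assumptions.

def partition_latency(k, edge_ms, cloud_ms, sizes_bytes, bandwidth_bps):
    """End-to-end latency (ms) when the first k layers run on the edge.

    sizes_bytes[k] is the size of the tensor crossing the network at cut k:
    index 0 is the raw input, index k is the output of layer k.
    """
    transfer_ms = sizes_bytes[k] * 8 / bandwidth_bps * 1000
    return sum(edge_ms[:k]) + transfer_ms + sum(cloud_ms[k:])


def best_partition(edge_ms, cloud_ms, sizes_bytes, bandwidth_bps):
    """Return the cut k (0..n) with the lowest modelled latency."""
    n = len(edge_ms)
    return min(
        range(n + 1),
        key=lambda k: partition_latency(
            k, edge_ms, cloud_ms, sizes_bytes, bandwidth_bps
        ),
    )


# Hypothetical per-layer execution times (ms) for a 4-layer model.
EDGE_MS = [4.0, 6.0, 8.0, 30.0]
CLOUD_MS = [1.0, 1.5, 2.0, 0.5]
# Hypothetical tensor sizes (bytes): input, then the output of each layer.
SIZES_BYTES = [600_000, 800_000, 200_000, 50_000, 4_000]

# The preferred cut moves toward the edge as the link gets slower.
for bandwidth_bps in (1e9, 1e8, 1e7):
    k = best_partition(EDGE_MS, CLOUD_MS, SIZES_BYTES, bandwidth_bps)
    print(f"{bandwidth_bps:.0e} bit/s -> run first {k} layer(s) on the edge")
```

Under these assumed numbers, the preferred cut moves from running everything in the cloud at 1 Gbit/s, to three edge layers at 100 Mbit/s, to all four layers on the edge at 10 Mbit/s. This is the kind of shift that motivates repartitioning an already deployed DNN. A model that also accounted for CPU or memory contention at the edge would need to scale the edge timings accordingly.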

System Support and Mechanisms for Adaptive Edge-to-cloud DNN Model Serving

Extendable Multi-Device Collaborative Pipeline Parallel Inference in the Edge-Cloud Scenario

Efficient Partitioning and Communication Scheme-Based Distributed Edge Computing to Accelerate Deep Neural Network

Condense: A Framework for Device and Frequency Adaptive Neural Network Models on the Edge

A Case For Adaptive Deep Neural Networks in Edge Computing

Model Parallelism Optimization for Distributed DNN Inference on Edge Devices

The Case for Adaptive Deep Neural Networks in Edge Computing

Edge Intelligence: On-Demand Deep Learning Model Co-Inference with Device-Edge Synergy

Adaptive Deep Inference Framework for Cloud-Edge Collaboration

Accelerating DNN Inference by Edge-Cloud Collaboration

Online Learning for Orchestration of Inference in Multi-User End-Edge-Cloud Networks

An Adaptive DNN Inference Acceleration Framework with End–edge–cloud Collaborative Computing

Accelerating Deep Neural Network Tasks Through Edge-Device Adaptive Inference

Partitioning and Deployment of Deep Neural Networks on Edge Clusters

Distributed Deep Neural Networks over the Cloud, the Edge and End Devices

Context-Aware Deep Model Compression for Edge Cloud Computing

DECC: Delay-Aware Edge-Cloud Collaboration for Accelerating DNN Inference

An Adaptive Task Migration Scheduling Approach for Edge-Cloud Collaborative Inference

A Novel Adaptive Computation Offloading Strategy for Collaborative DNN Inference over Edge Devices

Collaborative DNNs Inference with Joint Model Partition and Compression in Mobile Edge-Cloud Computing Networks

Distilled Split Deep Neural Networks for Edge-Assisted Real-Time Systems