Abstract: Deep Neural Networks (DNNs) have enabled numerous computer-vision applications. Because DNN models are computationally intensive and Industrial Internet-of-Things (IIoT) devices are resource-constrained, deploying and executing DNNs efficiently in industrial scenarios is challenging. Prior research has largely focused on model compression, which trades accuracy for efficiency, or on edge-cloud offloading, which depends on high-quality infrastructure support. In this article, we present EdgeDI, a framework for executing DNN inference in a partitioned, distributed manner on a cluster of IIoT devices. To improve inference performance, EdgeDI exploits two key optimization knobs: (1) model compression based on deep architecture design, which transforms the target DNN model into a compact one that reduces the resource requirements for IIoT devices without sacrificing accuracy; and (2) distributed inference based on adaptive workload partitioning, which achieves high parallelism by adaptively balancing the workload distribution among IIoT devices under heterogeneous resource conditions. We have implemented EdgeDI on PyTorch and evaluated it on the NEU-CLS defect classification task with two typical DNN models (VGG and ResNet) on a cluster of heterogeneous Raspberry Pi devices. The results indicate that each of the two optimization approaches significantly outperforms existing solutions in its specific domain. When the two are combined, EdgeDI provides scalable DNN inference speedups that are very close to, or even much higher than, the theoretical speedup bounds, while still maintaining the desired accuracy.
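
To make the idea of workload partitioning under heterogeneous resources concrete, the following is a minimal sketch and not EdgeDI's actual algorithm, which the abstract does not specify. It assumes each device is summarized by a single positive capability score (for example, measured throughput) and splits the rows of an input feature map in proportion to those scores. The function name `partition_rows` and the capability values are hypothetical, and the overlapping border rows that convolution layers need at partition boundaries are ignored for simplicity.

```python
from typing import List, Tuple


def partition_rows(total_rows: int, capabilities: List[float]) -> List[Tuple[int, int]]:
    """Split `total_rows` into contiguous [start, end) ranges, one per device,
    sized in proportion to each device's capability score (hypothetical metric)."""
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if not capabilities or any(c <= 0 for c in capabilities):
        raise ValueError("capabilities must be a non-empty list of positive numbers")

    total_capability = sum(capabilities)
    # Ideal (fractional) share of rows for each device.
    ideal = [total_rows * c / total_capability for c in capabilities]
    # Start from the rounded-down shares.
    counts = [int(share) for share in ideal]

    # Hand the rows lost to rounding to the devices with the largest
    # fractional remainders, so the counts sum exactly to total_rows.
    leftover = total_rows - sum(counts)
    by_remainder = sorted(
        range(len(ideal)), key=lambda i: ideal[i] - counts[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1

    # Convert per-device counts into contiguous row ranges.
    ranges = []
    start = 0
    for n in counts:
        ranges.append((start, start + n))
        start += n
    return ranges


if __name__ == "__main__":
    # Three devices, the third assumed to be twice as fast as the other two.
    print(partition_rows(224, [1.0, 1.0, 2.0]))
```

A static proportional split like this is only a starting point. An adaptive scheme would re-estimate the capability scores at runtime and repartition as device load changes.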

DeepThings: Distributed Adaptive Deep Learning Inference on Resource-Constrained IoT Edge Clusters

Efficient Partitioning and Communication Scheme-Based Distributed Edge Computing to Accelerate Deep Neural Network

DeeperThings: Fully Distributed CNN Inference on Resource-Constrained Edge Devices

Extendable Multi-Device Collaborative Pipeline Parallel Inference in the Edge-Cloud Scenario

EdgeCI: Distributed Workload Assignment and Model Partitioning for CNN Inference on Edge Clusters

Low Latency Deep Learning Inference Model for Distributed Intelligent IoT Edge Clusters

Adaptive Distributed Convolutional Neural Network Inference at the Network Edge with ADCNN

Joint Architecture Design and Workload Partitioning for DNN Inference on Industrial IoT Clusters

Automated Exploration and Implementation of Distributed CNN Inference at the Edge

DistrEdge: Speeding up Convolutional Neural Network Inference on Distributed Edge Devices

Toward Collaborative Inferencing of Deep Neural Networks on Internet-of-Things Devices

CoEdge: Cooperative DNN Inference With Adaptive Workload Partitioning Over Heterogeneous Edge Devices

Self-aware distributed deep learning framework for heterogeneous IoT edge devices

Design and Prototyping Distributed CNN Inference Acceleration in Edge Computing

EdgeSP: Scalable Multi-device Parallel DNN Inference on Heterogeneous Edge Clusters

Collaborative Execution of Deep Neural Networks on Internet of Things Devices

End-Edge Collaborative Inference of Convolutional Fuzzy Neural Networks for Big Data-Driven Internet of Things

AutoDiCE: Fully Automated Distributed CNN Inference at the Edge

Enhancing Distributed In-Situ CNN Inference in the Internet of Things

Adaptive Device-Edge Collaboration on DNN Inference in AIoT: A Digital Twin-Assisted Approach