Abstract: Deep learning (DL) applications have attracted significant attention as demand for Internet of Things (IoT) systems grows rapidly. However, running DL inference tasks on IoT devices is challenging because DL models are computationally demanding. Edge computing offers a solution by deploying resources near the end users. Resources at the edge are still limited, however, so management issues become essential: allocating networking resources and computing capabilities, and configuring the devices appropriately for different applications. As the knobs for such edge management, we consider multiple application tasks with different choices of DL model and hyperparameter settings, together with possible decomposition points based on the split DL concept, and use them to design the configuration tables. Layer-level decomposition in split DL provides greater flexibility by splitting a single DL inference model into parts that run on different computing devices, where each part consists of several consecutive layers. We then propose the SplitDL-Image and SplitDL-Video algorithms, based on the Vickrey–Clarke–Groves (VCG) mechanism, which account for model performance and frames per second (FPS) requirements along with the preferences of the heterogeneous IoT devices. The proposed method allocates networking and edge server computing resources according to the designed configuration tables by assigning an appropriate configuration to each IoT device. Simulation results based on real-world applications show that the proposed method allocates more resources to IoT devices with more urgent or important tasks, a preference for better accuracy, or a higher local computational cost. In addition, other desired properties, such as truthful bidding, individual rationality, and weak budget balance, are also guaranteed.
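To make the VCG idea concrete, the following is a minimal sketch of a generic VCG allocation over a small configuration table. It is not the SplitDL-Image or SplitDL-Video algorithm. The device names, the three configurations (local only, split, full offload), their resource costs, the reported valuations, and the helper names `best_allocation` and `vcg` are all hypothetical, and the brute-force search is only practical for toy instances.

```python
from itertools import product


def best_allocation(valuations, costs, capacity, exclude=None):
    """Brute-force the welfare-maximising choice of one configuration per device.

    valuations[d][c]: value device d reports for configuration c (hypothetical)
    costs[c]: edge resource units that configuration c consumes (hypothetical)
    capacity: total edge resource units available
    exclude: optional device to leave out of the market
    """
    devices = [d for d in valuations if d != exclude]
    best_welfare, best_assign = float("-inf"), None
    for choice in product(range(len(costs)), repeat=len(devices)):
        # Skip assignments that exceed the shared edge budget.
        if sum(costs[c] for c in choice) > capacity:
            continue
        welfare = sum(valuations[d][c] for d, c in zip(devices, choice))
        if welfare > best_welfare:
            best_welfare, best_assign = welfare, dict(zip(devices, choice))
    return best_welfare, best_assign


def vcg(valuations, costs, capacity):
    """Return the efficient assignment and each device's VCG (Clarke) payment."""
    welfare, assign = best_allocation(valuations, costs, capacity)
    payments = {}
    for d in valuations:
        # Welfare the other devices could reach if d were absent.
        welfare_without_d, _ = best_allocation(valuations, costs, capacity, exclude=d)
        # Welfare the other devices actually get in the chosen assignment.
        others_with_d = welfare - valuations[d][assign[d]]
        # d pays the harm its presence imposes on the others.
        payments[d] = welfare_without_d - others_with_d
    return assign, payments


# Hypothetical configuration table: 0 = local only, 1 = split, 2 = full offload.
costs = [0, 1, 2]
capacity = 2
valuations = {
    "cam": [0, 4, 7],
    "sensor": [0, 3, 4],
    "drone": [0, 5, 6],
}

assignment, payments = vcg(valuations, costs, capacity)
print(assignment)  # {'cam': 1, 'sensor': 0, 'drone': 1}
print(payments)    # {'cam': 3, 'sensor': 0, 'drone': 3}
```

In this toy instance every payment is non-negative and no larger than the value the device reports for its assigned configuration, which illustrates the weak budget balance and individual rationality properties named above. Configuration 0 costs nothing, so a feasible assignment always exists.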

Joint Optimization of Model Partitioning and Resource Allocation for Edge Computing with Intermittently Operating Devices

Joint Optimization With DNN Partitioning and Resource Allocation in Mobile Edge Computing

Efficient Partitioning and Communication Scheme-Based Distributed Edge Computing to Accelerate Deep Neural Network

Extendable Multi-Device Collaborative Pipeline Parallel Inference in the Edge-Cloud Scenario

Joint Optimization of Device Placement and Model Partitioning for Cooperative DNN Inference in Heterogeneous Edge Computing

Joint multi-user DNN partitioning and task offloading in mobile edge computing

Joint DNN Partition and Resource Allocation for Task Offloading in Edge-Cloud-Assisted IoT Environments

Joint Multi-User DNN Partitioning and Computational Resource Allocation for Collaborative Edge Intelligence

Joint DNN partitioning and resource allocation for completion rate maximization of delay-aware DNN inference tasks in wireless powered mobile edge computing

Dynamic Resource Allocation for Jointing Vehicle-Edge Deep Neural Network Inference

Energy-Efficient DNN Partitioning and Offloading for Task Completion Rate Maximization in Multiuser Edge Intelligence

Model Parallelism Optimization for Distributed DNN Inference on Edge Devices

Towards Resource-aware DNN Partitioning for Edge Devices with Heterogeneous Resources

A DNN inference acceleration algorithm combining model partition and task allocation in heterogeneous edge computing system

Conflict-Resilient Incremental Offloading of Deep Neural Networks to the Edge of Smart Environment

Energy-Efficient Joint Partitioning and Offloading for Delay-Sensitive CNN Inference in Edge Computing

Joint DNN Partition and Resource Allocation Optimization for Energy-Constrained Hierarchical Edge-Cloud Systems

Enabling Latency-Sensitive DNN Inference Via Joint Optimization of Model Surgery and Resource Allocation in Heterogeneous Edge

Edge–IoT Computing and Networking Resource Allocation for Decomposable Deep Learning Inference

Joint scheduling and offloading of computational tasks with time dependency under edge computing networks

Joint Job Offloading and Resource Allocation for Distributed Deep Learning in Edge Computing