Joint Optimization of Model Partitioning and Resource Allocation for Edge Computing with Intermittently Operating Devices

Qi Wu, Yi Zhang, Chenxi Yang, Jin Sun
DOI: https://doi.org/10.1109/icpads60453.2023.00295
2023-01-01
Abstract: The new intermittent computing paradigm allows energy-harvesting devices to operate intermittently, posing new challenges to edge intelligence in delivering high-quality computing services. This paper addresses the joint optimization of model partitioning and resource allocation to reduce the latency of deep neural network (DNN) applications on an edge computing system with intermittently operating devices. We establish a rigorous optimization model for the joint optimization problem that takes into account the heterogeneity and intermittent operation of end devices. Tailored to the inference procedure of DNN applications, we develop a family of partitioning rules for decomposing the DNN structure to facilitate computation offloading. We propose a model partitioning and resource allocation algorithm to determine the optimized assignment of computing resources for the DNN tasks offloaded from multiple devices onto the edge server. The proposed algorithm first uses the partitioning rules to obtain a preliminary decision on model partitioning, and then applies a greedy strategy to determine the final partitioning points of the DNN structures as well as the amount of computing resources assigned for task execution. Simulation results on an edge system with heterogeneous devices, including Raspberry Pi 3B+, Raspberry Pi 4B, and Jetson Xavier NX, demonstrate that the proposed algorithm achieves lower latency than baseline methods when executing DNN inference tasks.
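The two-stage idea described in the abstract (pick a partitioning point per device, then greedily hand out edge computing resources) can be illustrated with a minimal sketch. This is not the authors' algorithm: the latency model, the `Device` fields, the unit-based resource granularity, and the function names `latency`, `best_split`, and `greedy_allocate` are all assumptions made for illustration, and intermittent operation of devices is not modeled here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Device:
    # All fields are hypothetical parameters for this sketch.
    name: str
    speed: float             # local compute speed (work units per second)
    bandwidth: float         # uplink rate (data units per second)
    layer_work: List[float]  # compute work of each DNN layer, in order
    out_size: List[float]    # out_size[p]: data sent when splitting before layer p


def latency(dev: Device, p: int, edge_speed: float) -> float:
    """Latency when layers [0, p) run locally and layers [p, n) run on the edge."""
    local = sum(dev.layer_work[:p]) / dev.speed
    remote_work = sum(dev.layer_work[p:])
    if remote_work == 0:
        return local                      # nothing is offloaded
    if edge_speed <= 0:
        return float("inf")               # offloading needs some edge resource
    return local + dev.out_size[p] / dev.bandwidth + remote_work / edge_speed


def best_split(dev: Device, edge_speed: float) -> Tuple[float, int]:
    """Return (latency, partition point) minimizing latency for a given edge share."""
    n = len(dev.layer_work)
    return min((latency(dev, p, edge_speed), p) for p in range(n + 1))


def greedy_allocate(devices: List[Device], total_units: int,
                    unit_speed: float) -> Tuple[List[int], List[int]]:
    """Assign edge resource units one at a time to the device that gains the most."""
    shares = [0] * len(devices)
    for _ in range(total_units):
        gains = []
        for i, dev in enumerate(devices):
            current = best_split(dev, shares[i] * unit_speed)[0]
            improved = best_split(dev, (shares[i] + 1) * unit_speed)[0]
            gains.append((current - improved, i))
        gain, winner = max(gains)
        if gain <= 0:
            break                         # no device benefits from more resources
        shares[winner] += 1
    splits = [best_split(d, s * unit_speed)[1] for d, s in zip(devices, shares)]
    return shares, splits


if __name__ == "__main__":
    devices = [
        Device("slow", 1.0, 10.0, [4, 4, 4], [2, 1, 1, 0]),
        Device("fast", 100.0, 10.0, [4, 4, 4], [2, 1, 1, 0]),
    ]
    shares, splits = greedy_allocate(devices, total_units=4, unit_speed=10.0)
    for dev, s, p in zip(devices, shares, splits):
        print(dev.name, "units:", s, "split before layer:", p)
```

In this toy setting, a device that is fast enough to finish locally receives no edge resources, while a slow device offloads most of its layers; a real system would also need to account for energy availability and device on/off cycles.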