Abstract: Deep learning (DL) applications have attracted significant attention as demand for Internet of Things (IoT) systems grows rapidly. However, performing DL inference on IoT devices is challenging because of the large computational demands of DL models. Edge computing offers a solution by deploying resources near end users. Resources at the edge are still limited, however, so management issues become essential: allocating networking resources and computing capabilities, and configuring devices appropriately for different applications. As the knobs for such edge management, we consider multiple application tasks with different choices of DL model and different hyperparameter settings, together with possible decomposition points based on the split DL concept, and use them to design configuration tables. Layer-level decomposition in split DL provides greater flexibility by splitting a single DL inference model into parts that run on different computing devices, where each part consists of several consecutive layers. We then propose the SplitDL-Image and SplitDL-Video algorithms, based on the Vickrey–Clarke–Groves (VCG) mechanism, which account for model performance and frames-per-second (FPS) requirements together with the preferences of heterogeneous IoT devices. The proposed method allocates networking and edge server computing resources according to the designed configuration tables by assigning an appropriate configuration to each IoT device. Simulation results based on real-world applications show that the proposed method allocates more resources to IoT devices with more urgent or important tasks, a preference for better accuracy, or a higher local computational cost. In addition, other desired properties, such as truthful bidding, individual rationality, and weak budget balance, are guaranteed.
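To make the VCG idea in the abstract concrete, the following is a minimal sketch of a VCG allocation over per-device configuration tables. It is not the SplitDL-Image or SplitDL-Video algorithm: it is a brute-force stand-in, and every name in it (`best_assignment`, `vcg`, the `(bandwidth, edge_compute, reported_value)` tuple layout, the two resource caps) is an assumption made for illustration. The single `reported_value` per configuration stands in for whatever utility a device would bid, which in the paper depends on model performance, FPS, and device preferences.

```python
from itertools import product

# Hypothetical configuration entry: (bandwidth, edge_compute, reported_value).
# NULL models a device that receives no edge resources (assumed to be worth 0).
NULL = (0, 0, 0.0)


def best_assignment(tables, bandwidth_cap, compute_cap):
    """Exhaustively pick one configuration per device to maximize total
    reported value without exceeding either resource cap."""
    best_welfare = 0.0
    best_choice = tuple(NULL for _ in tables)
    options = [[NULL] + list(table) for table in tables]
    for choice in product(*options):
        if sum(c[0] for c in choice) > bandwidth_cap:
            continue
        if sum(c[1] for c in choice) > compute_cap:
            continue
        welfare = sum(c[2] for c in choice)
        if welfare > best_welfare:
            best_welfare, best_choice = welfare, choice
    return best_welfare, best_choice


def vcg(tables, bandwidth_cap, compute_cap):
    """Return the welfare-maximizing choice and each device's VCG payment.

    A device pays the externality it imposes: the best welfare the others
    could reach without it, minus the welfare the others actually get.
    """
    welfare, choice = best_assignment(tables, bandwidth_cap, compute_cap)
    payments = []
    for i in range(len(tables)):
        others = tables[:i] + tables[i + 1:]
        welfare_without_i, _ = best_assignment(others, bandwidth_cap, compute_cap)
        others_welfare_with_i = welfare - choice[i][2]
        payments.append(welfare_without_i - others_welfare_with_i)
    return choice, payments


# Illustrative (made-up) tables for three devices.
tables = [
    [(2, 1, 5.0), (4, 2, 9.0)],
    [(2, 1, 4.0), (4, 2, 6.0)],
    [(3, 2, 7.0)],
]
choice, payments = vcg(tables, bandwidth_cap=6, compute_cap=4)
```

In this toy instance no device pays more than the value it reported for its assigned configuration and no payment is negative, which is how individual rationality and weak budget balance show up in a VCG mechanism of this form. The exhaustive search is exponential in the number of devices, so it is only suitable for illustrating the payment rule.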

Efficient Communication-Computation Tradeoff for Split Computing: A Multi-Tier Deep Reinforcement Learning Approach

Optimum splitting computing for DNN training through next generation smart networks: a multi-tier deep reinforcement learning approach

Efficient Partitioning and Communication Scheme-Based Distributed Edge Computing to Accelerate Deep Neural Network

Dynamic Encoding and Decoding of Information for Split Learning in Mobile-Edge Computing: Leveraging Information Bottleneck Theory

An Efficient Split Learning Framework for Recurrent Neural Network in Mobile Edge Environment

Accelerating Split Federated Learning over Wireless Communication Networks

Communication and Computation Reduction for Split Learning using Asynchronous Training

Distilled Split Deep Neural Networks for Edge-Assisted Real-Time Systems

Split Learning Over Wireless Networks: Parallel Design and Resource Management

Adaptive Layer Splitting for Wireless LLM Inference in Edge Computing: A Model-Based Reinforcement Learning Approach

SplitPlace: AI Augmented Splitting and Placement of Large-Scale Neural Networks in Mobile Edge Environments

Optimized Computation Offloading Performance in Virtual Edge Computing Systems via Deep Reinforcement Learning

Online Learning for Orchestration of Inference in Multi-User End-Edge-Cloud Networks

An Efficient Online Computation Offloading Approach for Large-Scale Mobile Edge Computing via Deep Reinforcement Learning

Deep reinforcement learning‐based multitask hybrid computing offloading for multiaccess edge computing

Edge–IoT Computing and Networking Resource Allocation for Decomposable Deep Learning Inference

Joint DNN partitioning and task offloading in mobile edge computing via deep reinforcement learning

Joint DNN partitioning and resource allocation for completion rate maximization of delay-aware DNN inference tasks in wireless powered mobile edge computing

Communication-Efficient Split Learning via Adaptive Feature-Wise Compression

Online Computation Offloading and Resource Scheduling in Mobile-Edge Computing

Communication-Efficient Distributed Deep Learning: A Comprehensive Survey