Joint Batching and Scheduling for High-Throughput Multiuser Edge AI with Asynchronous Task Arrivals

Yihan Cang, Ming Chen, Kaibin Huang
2023-07-15
Abstract: In this paper, we study joint batching and (task) scheduling to maximise the throughput (i.e., the number of completed tasks) under the practical assumptions of heterogeneous task arrivals and deadlines. The design aims to optimise the number of batches, their starting time instants, and the task-batch association that determines batch sizes. The joint optimisation problem is complex due to the multiple coupled variables mentioned above and numerous constraints, including heterogeneous task arrivals and deadlines, the causality requirements on multi-task execution, and limited radio resources. Underpinning the problem is a basic tradeoff between batch size and the waiting time for tasks in the batch to be uploaded and executed. Our approach to solving the formulated mixed-integer problem is to transform it into a convex problem via integer relaxation and $\ell_0$-norm approximation. This results in an efficient alternating optimisation algorithm for finding a close-to-optimal solution. In addition, we design an optimal algorithm that leverages spectrum holes, which are caused by fixed bandwidth allocation to devices and their asynchronous multi-batch task execution, to admit unscheduled tasks so as to further enhance throughput. Simulation results demonstrate that the proposed framework of joint batching and resource allocation can substantially enhance the throughput of multiuser edge AI compared with a number of simpler benchmarking schemes, e.g., equal-bandwidth allocation, greedy batching and single-batch execution.
Signal Processing, Optimization and Control
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper aims to address the joint batching and scheduling problem in Edge AI systems within the context of sixth-generation networks (6G). Specifically, the goal is to maximize system throughput by optimizing batching and task scheduling under conditions of asynchronous task arrivals and tasks with different deadlines.

### Main Research Content

1. **Background and Challenges**:
   - Edge AI in 6G networks can provide inference services, enhancing the capabilities of mobile devices and extending battery life.
   - Batching can improve the computational throughput of edge servers by combining multiple tasks into a single batch, reducing memory access frequency.
   - In multi-user Edge AI systems, end-to-end latency depends not only on computation but also on communication (i.e., multi-user task uploads to the multiple access channel).
2. **Problem Definition**:
   - The study investigates joint batching and task scheduling to maximize throughput (i.e., the number of completed tasks), considering asynchronous task arrivals and different deadlines.
   - The design objective is to optimize the number of batches, the start time of each batch, and the association between tasks and batches to determine batch size.
   - The joint optimization problem is highly complex due to multiple coupled variables and various constraints, including asynchronous task arrivals, deadline requirements, causal relationships in multi-task execution, and limited wireless resources.
3. **Solution**:
   - The mixed-integer nonlinear programming problem is transformed into a convex problem using integer relaxation methods and ℓ0 norm approximation.
   - An efficient alternating optimization algorithm is proposed to find near-optimal solutions.
   - The algorithm alternates between solving two subproblems: optimal task-to-batch association and optimal batch start time selection.
   - A spectrum hole allocation algorithm is utilized to further improve throughput, allowing unscheduled tasks to enter the system.

### Conclusion

The paper proposes a joint batching and scheduling framework that effectively enhances the throughput of multi-user Edge AI systems under conditions of asynchronous task arrivals and different deadlines. The proposed algorithm addresses the complex optimization problem through integer relaxation and alternating optimization, and its significant performance improvement is validated through simulations.
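The tradeoff between batch size and waiting time noted in the abstract can be illustrated with a toy model. The sketch below is not the paper's algorithm: it considers a single batch only, ignores bandwidth allocation, and assumes (purely for illustration) that a batch of `n` tasks takes `a + b * n` time units to execute. The function names, the task tuples `(ready_time, deadline)`, and the parameters `a` and `b` are all hypothetical.

```python
from typing import List, Tuple

Task = Tuple[float, float]  # (time the upload finishes, deadline) -- hypothetical format


def best_batch_at(start: float, tasks: List[Task], a: float, b: float) -> int:
    """Largest number of tasks a single batch starting at `start` can finish on time.

    Assumes an affine batch latency a + b * n (an illustrative model only).
    """
    # Only tasks already uploaded by `start` can join the batch.
    deadlines = sorted((d for r, d in tasks if r <= start), reverse=True)
    # Keeping the n tasks with the latest deadlines is the best choice for size n,
    # so the binding constraint is the n-th latest deadline.
    for n in range(len(deadlines), 0, -1):
        finish = start + a + b * n
        if finish <= deadlines[n - 1]:
            return n
    return 0


def best_single_batch(tasks: List[Task], a: float, b: float) -> Tuple[int, float]:
    """Scan candidate start times and return (completed tasks, start time)."""
    # Starting between two upload-completion times only adds waiting,
    # so it suffices to try each upload-completion time as a start.
    candidates = sorted({r for r, _ in tasks})
    return max(
        ((best_batch_at(s, tasks, a, b), s) for s in candidates),
        key=lambda pair: (pair[0], -pair[1]),  # most tasks, then earliest start
        default=(0, 0.0),
    )


tasks = [(0.0, 5.0), (1.0, 5.0), (2.0, 6.0), (5.0, 7.0)]
for s in sorted({r for r, _ in tasks}):
    print(f"start={s}: {best_batch_at(s, tasks, a=1.0, b=0.5)} task(s) completed")
print(best_single_batch(tasks, a=1.0, b=0.5))
```

In this toy instance, delaying the batch start from 0 to 2 lets more uploaded tasks join and raises the count from one to three, while waiting until time 5 for the last task leaves room for only one task before the deadlines expire. Executing several batches, as the paper proposes, is one way to avoid having to choose.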