Abstract: In this paper, we study joint batching and (task) scheduling to maximise the throughput (i.e., the number of completed tasks) under the practical assumptions of heterogeneous task arrivals and deadlines. The design aims to optimise the number of batches, their starting time instants, and the task-batch association that determines batch sizes. The joint optimisation problem is complex due to the multiple coupled variables mentioned above and numerous constraints, including heterogeneous task arrivals and deadlines, the causality requirements on multi-task execution, and limited radio resources. Underpinning the problem is a basic tradeoff between the batch size and the waiting time for tasks in the batch to be uploaded and executed. Our approach to solving the formulated mixed-integer problem is to transform it into a convex problem via integer relaxation and $\ell_0$-norm approximation. This results in an efficient alternating optimisation algorithm for finding a close-to-optimal solution. In addition, we design an optimal algorithm that leverages spectrum holes, which are caused by fixed bandwidth allocation to devices and their asynchronous multi-batch task execution, to admit unscheduled tasks and thereby further enhance throughput. Simulation results demonstrate that the proposed framework of joint batching and resource allocation can substantially enhance the throughput of multiuser edge AI compared with a number of simpler benchmarking schemes, e.g., equal-bandwidth allocation, greedy batching and single-batch execution.

What problem does this paper attempt to address?

### Problems Addressed by the Paper

This paper aims to address the joint batching and scheduling problem in Edge AI systems within the context of sixth-generation networks (6G). Specifically, the goal is to maximize system throughput by optimizing batching and task scheduling under conditions of asynchronous task arrivals and tasks with different deadlines.

### Main Research Content

1. **Background and Challenges**:
   - Edge AI in 6G networks can provide inference services, enhancing the capabilities of mobile devices and extending battery life.
   - Batching can improve the computational throughput of edge servers by combining multiple tasks into a single batch, reducing memory access frequency.
   - In multi-user Edge AI systems, end-to-end latency depends not only on computation but also on communication (i.e., multi-user task uploads to the multiple access channel).
2. **Problem Definition**:
   - The study investigates joint batching and task scheduling to maximize throughput (i.e., the number of completed tasks), considering asynchronous task arrivals and different deadlines.
   - The design objective is to optimize the number of batches, the start time of each batch, and the association between tasks and batches to determine batch size.
   - The joint optimization problem is highly complex due to multiple coupled variables and various constraints, including asynchronous task arrivals, deadline requirements, causal relationships in multi-task execution, and limited wireless resources.
3. **Solution**:
   - The mixed-integer nonlinear programming problem is transformed into a convex problem using integer relaxation methods and ℓ0 norm approximation.
   - An efficient alternating optimization algorithm is proposed to find near-optimal solutions.
   - The algorithm alternates between solving two subproblems: optimal task-to-batch association and optimal batch start time selection.
   - A spectrum hole allocation algorithm is utilized to further improve throughput, allowing unscheduled tasks to enter the system.

### Conclusion

The paper proposes a joint batching and scheduling framework that effectively enhances the throughput of multi-user Edge AI systems under conditions of asynchronous task arrivals and different deadlines. The proposed algorithm addresses the complex optimization problem through integer relaxation and alternating optimization, and its significant performance improvement is validated through simulations.
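
The tradeoff between batch size and waiting time noted in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it considers a single batch only, ignores bandwidth allocation, and assumes a hypothetical affine latency model (`setup + per_task * size`). The names `Task`, `batch_latency`, and `completed_in_single_batch`, and all numeric values, are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    arrival: float   # time at which the task is fully uploaded (assumed)
    deadline: float  # absolute time by which its result is needed


def batch_latency(size: int, setup: float = 2.0, per_task: float = 0.5) -> float:
    """Hypothetical affine model: execution time grows with batch size."""
    return setup + per_task * size


def completed_in_single_batch(tasks: list[Task], start: float) -> int:
    """Largest number of tasks one batch starting at `start` can complete.

    Only tasks that have arrived by `start` can join. Every task in the
    batch finishes at the same time, so while the tightest deadline is
    missed, that task is dropped, which also shortens the batch.
    """
    ready = sorted(
        (t for t in tasks if t.arrival <= start),
        key=lambda t: t.deadline,
        reverse=True,  # loosest deadline first, tightest last
    )
    while ready:
        finish = start + batch_latency(len(ready))
        if ready[-1].deadline >= finish:
            return len(ready)
        ready.pop()  # drop the task with the tightest deadline
    return 0


# Illustrative (made-up) task set with asynchronous arrivals.
tasks = [Task(0.0, 8.0), Task(1.0, 8.0), Task(3.0, 8.0), Task(5.0, 9.0)]
results = {s: completed_in_single_batch(tasks, s) for s in (1.0, 3.0, 5.0)}
print(results)
```

With these made-up numbers, starting at time 1 completes 2 tasks, starting at time 3 completes 3, and starting at time 5 completes only 2: waiting admits more arrivals, but waiting too long pushes the finish time past the earlier deadlines. The joint problem in the paper additionally chooses the number of batches, the task-batch association, and the radio resources.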

Joint Batching and Scheduling for High-Throughput Multiuser Edge AI with Asynchronous Task Arrivals

AI-oriented Workload Allocation for Cloud-Edge Computing

Joint User Scheduling and Resource Allocation for Millimeter Wave Systems Relying on Adaptive-Resolution ADCs

Joint Device Scheduling and Resource Allocation for ISCC-Based Multi-View-Multi-Task Inference

Resource Allocation for Multiuser Edge Inference with Batching and Early Exiting (Extended Version)

Joint Task Offloading and Resource Allocation for Quality-Aware Edge-Assisted Machine Learning Task Inference

Dynamic Batching and Early-Exiting for Accurate and Timely Edge Inference

Simulation-based joint user assignment and edge resource allocation optimization for hybrid tasks in vehicular edge computing

Joint Task Offloading and Resource Allocation in Heterogeneous Edge Environments

Joint scheduling and offloading of computational tasks with time dependency under edge computing networks

On-demand Edge Inference Scheduling with Accuracy and Deadline Guarantee

Leveraging Joint Allocation of Multidimensional Resources for Distributed Task Assignment

Joint Job Offloading and Resource Allocation for Distributed Deep Learning in Edge Computing

Digital Twin-Driven Collaborative Scheduling for Heterogeneous Task and Edge-End Resource Via Multi-Agent Deep Reinforcement Learning

Task allocation algorithm and optimization model on edge collaboration

Algorithms for the joint multitasking scheduling and common due date assignment problem

Online Approximation Scheme for Scheduling Heterogeneous Utility Jobs in Edge Computing

Collaborative Service Placement, Task Scheduling, and Resource Allocation for Task Offloading with Edge-Cloud Cooperation

Optimizing Task-Specific Timeliness With Edge-Assisted Scheduling for Status Update

Multiuser Computation Offloading and Resource Allocation for Cloud–Edge Heterogeneous Network

Joint Service Deployment and Task Scheduling for Satellite Edge Computing: A Two-Timescale Hierarchical Approach