What problem does this paper attempt to address?

This paper addresses how to minimize packet-level delay jitter while meeting strict delay constraints in Ultra-Reliable Low-Latency Communication (URLLC). Specifically, it designs a joint resource allocation and scheduling algorithm for a multi-cell, multi-user environment so that each user's packets are transmitted before their deadlines with as little delay jitter as possible.

### Problem Background

With the development of 5G, URLLC has received increasing attention. In applications such as industrial process control, the requirements on end-to-end (E2E) delay, reliability, and delay jitter are very strict. According to the 3GPP standard, closed-loop control applications require an E2E delay within 2 milliseconds, a reliability of \(1 - 10^{-6}\), and a delay jitter within 1 microsecond. These requirements place extremely high demands on the communication system.

### Shortcomings of Existing Research

Existing research mainly focuses on resource allocation and usually ignores delay jitter. In addition, most studies treat delay violation as an optimization objective, minimizing the probability of violation, rather than as a strict constraint. In practice, this can leave the strict delay constraints unmet.

### Main Contributions of the Paper

To address these problems, the paper makes the following contributions:

1. **Multi-cell, multi-user scenarios under low delay jitter and probabilistic delay constraints**: For the first time, the paper proposes a two-stage intelligent scheduling algorithm, jointly model-driven and data-driven, that minimizes jitter while guaranteeing probabilistic delay constraints.
2. **First stage: Lyapunov techniques**: The long-term delay violation probability constraint is converted into a single-time-slot virtual queue stability condition, which ensures that the delay constraint is met.
3. **Second stage: Hierarchical multi-agent reinforcement learning (MARL) algorithm**: Allocating resources directly leads to an overly large action space, so a hierarchical algorithm is introduced. The MARL algorithm assigns each user a priority and a number of packets to schedule, and an underlying scheduler then allocates resources according to the priorities.

### Summary

In the simulation results, the proposed two-stage intelligent scheduling algorithm (LGQP-IPS) outperforms both the traditional Round-Robin EDF algorithm and a MARL algorithm with a delay violation penalty: it reduces delay jitter under medium and low traffic loads and reduces the delay violation rate under high traffic loads. This shows that the algorithm has significant advantages in handling delay-sensitive tasks.

### Formula Display

- **SINR formula**:
  \[ \gamma_t^{u,f}=\frac{\zeta_t^{u,f}\|h_t^{H,f,\bar{b}_u,u}w_t^{f,\bar{b}_u,u}\|^2}{\sum_{v\in U\setminus\{u\}}\zeta_t^{v,f}\|h_t^{H,f,\bar{b}_v,u}w_t^{f,\bar{b}_v,v}\|^2+(\sigma_t^{u,f})^2} \]
- **Achievable rate formula**:
  \[ \psi_t^u=\sum_{f\in F}\log_2(1+\gamma_t^{u,f})-Q^{-1}(\epsilon_u)\sqrt{\sum_{f\in F}V_t^{u,f}} \]
  where \(Q^{-1}(\cdot)\) is the inverse of the Gaussian Q-function and \(V_t^{u,f}=[\log_2(e)]^2(1-(1+\gamma_t^{u,f})^{-2})\).
- **System average delay jitter formula**:
  \[ f(\{\zeta_t^{u,f}\})=\frac{1}{U}\sum_{u\in U}\sqrt{\frac{1}{|C_u|}\sum_{c\in C_u}(\zeta_{t+1}^{u,f}-\zeta_t^{u,f})^2} \]
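
The achievable rate formula above can be evaluated numerically. The following is a minimal Python sketch that follows the expression exactly as written in this summary; it is not the paper's implementation. The function names (`q_inv`, `channel_dispersion`, `achievable_rate`) and the example SINR values are hypothetical, and the paper itself may apply additional scaling (for example by bandwidth or blocklength) that the summary does not show.

```python
import math
from statistics import NormalDist


def q_inv(eps: float) -> float:
    """Inverse Gaussian Q-function.

    Q(x) = 1 - Phi(x), so Q^{-1}(eps) = Phi^{-1}(1 - eps),
    where Phi is the standard normal CDF.
    """
    return NormalDist().inv_cdf(1.0 - eps)


def channel_dispersion(sinr: float) -> float:
    """V = [log2(e)]^2 * (1 - (1 + sinr)^(-2)) for one resource block."""
    return math.log2(math.e) ** 2 * (1.0 - (1.0 + sinr) ** -2)


def achievable_rate(sinrs, eps: float) -> float:
    """psi = sum_f log2(1 + gamma_f) - Q^{-1}(eps) * sqrt(sum_f V_f).

    `sinrs` holds one linear-scale SINR value per resource block f in F
    (assumed input layout); `eps` is the user's decoding error probability.
    """
    shannon_term = sum(math.log2(1.0 + g) for g in sinrs)
    dispersion_sum = sum(channel_dispersion(g) for g in sinrs)
    return shannon_term - q_inv(eps) * math.sqrt(dispersion_sum)


if __name__ == "__main__":
    # Hypothetical example: 100 resource blocks, each at SINR = 3 (about 4.8 dB),
    # with the 1e-6 error target quoted in the problem background.
    print(achievable_rate([3.0] * 100, eps=1e-6))
```

Note that with only a few resource blocks or low SINR, the dispersion penalty can exceed the Shannon term and the expression becomes negative; the summary does not say how the paper handles that case.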

Lyapunov-guided Multi-Agent Reinforcement Learning for Delay-Sensitive Wireless Scheduling
