Abstract: In this paper, a two-stage intelligent scheduler is proposed to minimize the packet-level delay jitter while guaranteeing delay bound. Firstly, Lyapunov technology is employed to transform the delay-violation constraint into a sequential slot-level queue stability problem. Secondly, a hierarchical scheme is proposed to solve the resource allocation between multiple base stations and users, where the multi-agent reinforcement learning (MARL) gives the user priority and the number of scheduled packets, while the underlying scheduler allocates the resource. Our proposed scheme achieves lower delay jitter and delay violation rate than the Round-Robin Earliest Deadline First algorithm and MARL with delay violation penalty.
What problem does this paper attempt to address?
This paper addresses how to minimize packet-level delay jitter while meeting strict delay constraints in the Ultra-Reliable Low-Latency Communication (URLLC) scenario. Specifically, it designs a joint resource allocation and scheduling algorithm for a multi-cell, multi-user environment, so that each user's service packets are transmitted before their deadlines and delay jitter is reduced as much as possible.
### Problem Background
With the development of 5G, URLLC has received increasing attention. Applications such as industrial process control place very strict requirements on end-to-end (E2E) delay, reliability, and delay jitter. According to the 3GPP standard, closed-loop control applications require an E2E delay within 2 milliseconds, a reliability of \(1 - 10^{-6}\), and a delay jitter within 1 microsecond. These requirements pose extremely demanding performance challenges for the communication system.
### Shortcomings of Existing Research
Existing research mainly focuses on the resource allocation problem and usually ignores the impact of delay jitter. In addition, most studies treat delay violation as an optimization objective, minimizing the probability of violation, rather than as a strict constraint. Because a minimized violation probability is not a guaranteed one, this approach may fail to meet strict delay constraints in practical applications.
### Main Contributions of the Paper
To solve the above problems, the paper makes the following contributions:
1. **Multi-cell, multi-user scheduling under low delay jitter and probabilistic delay constraints**: According to the paper, it is the first to propose a joint model-driven and data-driven two-stage intelligent scheduling algorithm that minimizes jitter while ensuring a probabilistic delay constraint.
2. **First stage: Lyapunov technique**: The long-term delay violation probability constraint is converted into a per-slot virtual queue stability condition, so that keeping the virtual queue stable ensures the delay constraint.
3. **Second stage: Hierarchical multi-agent reinforcement learning (MARL) algorithm**: Allocating resources directly would make the action space too large, so a hierarchical algorithm is introduced. The MARL algorithm assigns each user a priority and a number of scheduled packets, and the underlying scheduler then allocates resources according to those priorities.
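The two stages listed above can be illustrated with a minimal Python sketch. It is not the paper's implementation. The virtual queue uses the standard Lyapunov update \(Z_{t+1}=\max(Z_t + x_t - \epsilon, 0)\), where \(x_t\) is the fraction of packets that violated their deadline in slot \(t\) and \(\epsilon\) is the tolerated violation rate. The lower-level scheduler is assumed to be a simple greedy pass in priority order. The names `update_virtual_queue` and `allocate_by_priority`, and the use of a single scalar `capacity`, are hypothetical.

```python
def update_virtual_queue(z, violation_rate, eps):
    """One slot of the virtual queue Z_{t+1} = max(Z_t + x_t - eps, 0).

    If Z stays stable over time, the long-run average of x_t cannot
    exceed eps, which is how a long-term constraint becomes a
    per-slot stability condition.
    """
    return max(z + violation_rate - eps, 0.0)


def allocate_by_priority(priorities, requested, capacity):
    """Hypothetical lower-level scheduler.

    priorities: user -> priority score (assumed to come from the MARL agents)
    requested:  user -> number of packets the agent wants scheduled
    capacity:   total packets that can be served in this slot (assumption)
    Users are served in descending priority until capacity runs out.
    """
    grants = {user: 0 for user in requested}
    for user in sorted(requested, key=lambda u: priorities[u], reverse=True):
        grants[user] = min(requested[user], capacity)
        capacity -= grants[user]
    return grants


if __name__ == "__main__":
    z = 0.0
    for x in (1.0, 0.0, 0.0):  # one bad slot followed by two clean slots
        z = update_virtual_queue(z, x, eps=0.1)
    print("virtual queue backlog:", z)
    print(allocate_by_priority({"a": 0.2, "b": 0.9, "c": 0.5},
                               {"a": 3, "b": 2, "c": 4}, capacity=5))
```

Splitting the decision this way keeps each agent's action small (a priority and a packet count), while the deterministic scheduler handles the combinatorial part of the allocation.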
### Summary
The two-stage intelligent scheduling algorithm (LGQP-IPS) proposed in the paper is evaluated by simulation against the traditional Round-Robin EDF algorithm and a MARL algorithm with a delay violation penalty. It reduces delay jitter under medium and low traffic loads and reduces the delay violation rate under high traffic loads, which indicates an advantage in handling delay-sensitive tasks.
### Formula Display
- **SINR formula**:
\[
\gamma_t^{u,f}=\frac{\zeta_t^{u,f}\left|\left(h_t^{f,\bar{b}_u,u}\right)^{H}w_t^{f,\bar{b}_u,u}\right|^2}{\sum_{v\in U\setminus\{u\}}\zeta_t^{v,f}\left|\left(h_t^{f,\bar{b}_v,u}\right)^{H}w_t^{f,\bar{b}_v,v}\right|^2+\left(\sigma_t^{u,f}\right)^2}
\]
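As a rough illustration of the SINR expression above, the sketch below evaluates it for one user on one sub-band using plain Python complex numbers. The data layout is an assumption made for the example: `channel[(b, u)]` is the channel vector from base station `b` to user `u`, `beam[v]` is the beamforming vector that user `v`'s serving base station uses for `v`, `power[v]` stands in for \(\zeta_t^{v,f}\), and `noise_var` for the squared noise term.

```python
def sinr(u, power, channel, beam, serving_bs, noise_var):
    """SINR of user u on a single sub-band and slot (illustrative layout)."""

    def gain(v):
        # |h^H w|^2: beam intended for user v, sent by v's serving BS,
        # as received at user u.
        h = channel[(serving_bs[v], u)]
        w = beam[v]
        inner = sum(hk.conjugate() * wk for hk, wk in zip(h, w))
        return abs(inner) ** 2

    signal = power[u] * gain(u)
    # Every other scheduled user's transmission is treated as interference.
    interference = sum(power[v] * gain(v) for v in power if v != u)
    return signal / (interference + noise_var)


if __name__ == "__main__":
    power = {"a": 2.0, "b": 4.0}
    serving_bs = {"a": 0, "b": 1}
    channel = {(0, "a"): [1 + 0j, 0j], (1, "a"): [0.5 + 0j, 0j]}
    beam = {"a": [1 + 0j, 0j], "b": [1 + 0j, 0j]}
    print(sinr("a", power, channel, beam, serving_bs, noise_var=1.0))
```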
- **Achievable Rate Formula**:
\[
\psi_t^u=\sum_{f\in F}\log_2(1 + \gamma_t^{u,f})-Q^{-1}(\epsilon_u)\sqrt{\sum_{f\in F}V_t^{u,f}}
\]
where \(Q^{-1}(\cdot)\) is the inverse of the Gaussian Q-function, and \(V_t^{u,f}=[\log_2(e)]^2(1-(1 + \gamma_t^{u,f})^{-2})\) is the channel dispersion.
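The rate expression above can be checked numerically with a short sketch that uses only the Python standard library. It evaluates the formula exactly as displayed, with \(Q^{-1}(\epsilon)\) computed as the standard normal quantile at \(1-\epsilon\). The function names and the sample SINR values are illustrative assumptions.

```python
import math
from statistics import NormalDist


def dispersion(gamma):
    """V = [log2(e)]^2 * (1 - (1 + gamma)^-2) for one sub-band."""
    return math.log2(math.e) ** 2 * (1.0 - (1.0 + gamma) ** -2)


def achievable_rate(gammas, eps):
    """Sum of log2(1 + gamma) over sub-bands minus the error-rate penalty.

    gammas: SINR values of one user on its sub-bands
    eps:    target decoding error probability, strictly between 0 and 1
    """
    q_inv = NormalDist().inv_cdf(1.0 - eps)  # Q^{-1}(eps)
    shannon = sum(math.log2(1.0 + g) for g in gammas)
    penalty = q_inv * math.sqrt(sum(dispersion(g) for g in gammas))
    return shannon - penalty


if __name__ == "__main__":
    gammas = [1.0, 3.0]  # illustrative SINR values
    print("Shannon-only rate:", achievable_rate(gammas, eps=0.5))
    print("Rate at eps=1e-6: ", achievable_rate(gammas, eps=1e-6))
```

With \(\epsilon = 0.5\) the penalty term vanishes because \(Q^{-1}(0.5)=0\); tightening \(\epsilon\) toward URLLC-style targets makes the penalty larger and the achievable rate smaller.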
- **System Average Delay Jitter Formula**:
\[
f(\{\zeta_t^{u,f}\})=\frac{1}{|U|}\sum_{u\in U}\sqrt{\frac{1}{|C_u|}\sum_{c\in C_u}(\zeta_{t + 1}^{u,f}-\zeta_t^{u,f})^2}
\]
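A minimal sketch of a jitter metric with the same shape as the formula above: a per-user root mean square of successive differences, averaged over users. Two assumptions are made here that the summary does not confirm. First, the quantity being differenced is taken to be the per-packet delay of each user. Second, the mean is taken over the number of differences rather than over \(|C_u|\). The name `average_jitter` is hypothetical.

```python
import math


def average_jitter(delays_per_user):
    """Average over users of the RMS of successive per-packet delay differences.

    delays_per_user: user -> list of packet delays in transmission order
                     (assumed interpretation of the per-packet quantity)
    """
    per_user = []
    for delays in delays_per_user.values():
        diffs = [later - earlier for earlier, later in zip(delays, delays[1:])]
        if diffs:
            rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
        else:
            rms = 0.0  # fewer than two packets: no jitter can be measured
        per_user.append(rms)
    return sum(per_user) / len(per_user)


if __name__ == "__main__":
    delays = {"a": [1.0, 1.0, 1.0], "b": [1.0, 2.0, 1.0]}
    print(average_jitter(delays))
```

A user whose packets all experience the same delay contributes zero, so the metric rewards constant delay rather than merely low delay.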