Deep Reinforcement Learning-Based Scheduling for NR-U/WiGig Coexistence in Unlicensed mmWave Bands

Xiaowen Ye, Qian Zhou, Liqun Fu
DOI: https://doi.org/10.1109/twc.2023.3275945
IF: 10.4
2024-01-01
IEEE Transactions on Wireless Communications
Abstract: This paper investigates the coexistence of the New Radio-based access to Unlicensed spectrum (NR-U) network and the Wireless Gigabit (WiGig) network in unlicensed millimeter-wave (mmWave) bands. To enable the NR-U network to share spectrum fairly and harmoniously with WiGig systems, we develop two new classes of user equipment (UE) scheduling schemes based on deep reinforcement learning (DRL). Specifically, we first propose the distributed deep reinforcement learning scheduling (DeepDS) scheme, in which multiple deep neural networks (DNNs) make decisions for different panels independently. Then, to reduce the computational cost of using multiple DNNs, we design the centralized deep reinforcement learning scheduling (DeepCS) scheme, which introduces a shared DNN framework to make decisions for all panels in parallel. The objective of both DeepDS and DeepCS is to maximize the total data rate of the NR-U network with as little interference to WiGig systems as possible, while satisfying the quality of service (QoS) requirement of each UE. We first formulate this problem as a constrained Markov decision process. To handle the multiple constraints, we put forth a new DRL algorithm that incorporates Lagrangian primal-dual optimization into the deep Q-network framework, referred to as adaptive multi-constraint deep Q-network (AMC-DQN). With AMC-DQN, both DeepDS and DeepCS can achieve their goals even without prior knowledge of the WiGig network's operation. Simulation results show that, compared with the state-of-the-art omniLBT and dirLBT schemes, both DeepDS and DeepCS yield significant performance benefits in terms of the total network data rate. We also demonstrate the ability of DeepDS and DeepCS to satisfy the QoS requirements of different UEs and their robustness across various simulation setups. Furthermore, compared with DeepDS, DeepCS saves a large amount of computational cost, although at the expense of a slightly lower data rate.
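The Lagrangian primal-dual idea that the abstract attributes to AMC-DQN can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names, the choice of constraint costs (for example, interference to WiGig and a per-UE QoS shortfall), the limits, and the step size are all assumptions made for illustration. The sketch shows only the generic pattern for a constrained Markov decision process: the learner is trained on a reward penalized by multiplier-weighted costs, and each multiplier rises when its constraint is violated on average and is projected back to zero otherwise.

```python
from typing import List


def lagrangian_reward(rate: float, costs: List[float], multipliers: List[float]) -> float:
    """Scalar reward fed to the Q-learner (hypothetical helper).

    rate        : data rate obtained in this step (the objective).
    costs       : one cost per constraint, e.g. interference, QoS shortfall (assumed).
    multipliers : current Lagrange multipliers, one per constraint.
    """
    return rate - sum(lam * c for lam, c in zip(multipliers, costs))


def dual_update(
    multipliers: List[float],
    avg_costs: List[float],
    limits: List[float],
    step_size: float,
) -> List[float]:
    """Projected gradient ascent on the multipliers (hypothetical helper).

    A multiplier grows when the average cost exceeds its limit and shrinks
    otherwise; max(0, .) keeps every multiplier non-negative.
    """
    return [
        max(0.0, lam + step_size * (c - d))
        for lam, c, d in zip(multipliers, avg_costs, limits)
    ]


# Illustrative usage with made-up numbers.
multipliers = [0.0, 0.0]
avg_costs = [0.3, 0.05]   # first constraint violated, second satisfied
limits = [0.1, 0.1]
multipliers = dual_update(multipliers, avg_costs, limits, step_size=0.5)
print(multipliers)
print(lagrangian_reward(2.0, avg_costs, multipliers))
```

In a full agent, the penalized reward would replace the raw reward in the deep Q-network's temporal-difference target, with the multiplier update run on a slower timescale than the Q-network update; how AMC-DQN adapts its multipliers in detail is specified in the paper, not here.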