Power Allocation for Device-to-Multi-Device Enabled HetNets - A Deep Reinforcement Learning Approach.

Yang Xiao, Jiawei Wu, Jun Liu
DOI: https://doi.org/10.1109/globecom46510.2021.9685145
2021-01-01
Abstract: Device-to-device (D2D) communication exploits geographical proximity by allowing neighboring devices to communicate directly with each other, making it one of the most promising technologies for improving the spectral and energy efficiency of 5G and beyond communication systems. To further improve spectral efficiency and generalize application scenarios, the emerging device-to-multi-device (D2MD) communication enables a D2D transmitter to communicate with multiple receivers simultaneously. In this paper, we consider a heterogeneous network (HetNet) where multiple D2MD clusters coexist with the base station (BS) and cellular users (CUs). All D2MD clusters share the same downlink channel as the cellular network, which potentially leads to severe co-channel interference. To solve this problem, we leverage deep reinforcement learning (DRL) and propose the deep reinforcement power allocation (DRPA) algorithm to dynamically allocate power for D2MD communication in HetNets. In addition, we apply the centralized training distributed execution (CTDE) technique to accelerate the training process and improve the robustness of DRPA. Simulation results demonstrate that the DRPA algorithm outperforms baseline methods in terms of average sum-rate. Moreover, the DRPA algorithm is robust to changes in the network environment while achieving near-optimal performance.
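To make the objective concrete, the following is a minimal sketch of how a sum-rate under co-channel interference could be evaluated for a given power allocation. It is not the DRPA algorithm and is not taken from the paper. The function names, the data layout of the gain arrays, and the assumption that a multicast cluster's rate is limited by its weakest receiver are all illustrative choices; interference from the BS is omitted for brevity and could be folded into `noise_power`.

```python
import math


def sinr(signal_gain, tx_power, interferers, noise_power):
    """Signal-to-interference-plus-noise ratio at one receiver.

    interferers is an iterable of (channel_gain, tx_power) pairs for
    every other transmitter sharing the channel.
    """
    interference = sum(gain * power for gain, power in interferers)
    return signal_gain * tx_power / (interference + noise_power)


def cluster_rate(sinrs, bandwidth_hz=1.0):
    """Shannon rate of one multicast cluster.

    Assumption (not from the paper): the cluster rate is set by the
    receiver with the lowest SINR, so every receiver can decode.
    """
    return bandwidth_hz * math.log2(1.0 + min(sinrs))


def sum_rate(powers, direct_gains, cross_gains, noise_power):
    """Total rate over all clusters for one power allocation.

    powers[k]            transmit power of cluster k
    direct_gains[k][r]   gain from transmitter k to its own receiver r
    cross_gains[j][k][r] gain from transmitter j to receiver r of cluster k
    """
    total = 0.0
    for k, p_k in enumerate(powers):
        sinrs = []
        for r, gain in enumerate(direct_gains[k]):
            # Every other cluster's transmitter interferes with receiver r.
            interferers = [
                (cross_gains[j][k][r], powers[j])
                for j in range(len(powers))
                if j != k
            ]
            sinrs.append(sinr(gain, p_k, interferers, noise_power))
        total += cluster_rate(sinrs)
    return total
```

A power allocation policy, learned or otherwise, would choose `powers` to make a quantity like this large. Raising one cluster's power raises its own SINR but lowers that of every cluster it interferes with, which is why the allocation has to be decided jointly.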