A Reinforcement Learning Based Joint Spectrum Allocation and Power Control Algorithm for D2D Communication Underlaying Cellular Networks

Wentai Chen, Jun Zheng
DOI: https://doi.org/10.1007/978-3-030-22968-9_13
2019-01-01
Abstract: This paper studies the spectrum allocation and power control (SA-PC) problem in device-to-device (D2D) communication underlaying a cellular network. A distributed multi-agent reinforcement learning (MARL) based joint SA-PC algorithm is proposed to perform spectrum allocation and power control for each D2D user in the network. The proposed algorithm uses Q-learning, a typical form of reinforcement learning (RL), to select the optimal resource block (RB) and power level for each D2D user. In the Q-learning algorithm, each D2D user is treated as an individual agent and maintains a single-state Q table. During learning, each agent selects an RB and a power level according to its Q table. Simulation results show that the proposed Q-learning based joint SA-PC algorithm can achieve good throughput performance.
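The single-state Q table described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the reward function, the exploration rule, or the learning parameters, so the epsilon-greedy policy, the stateless update (no discounted next-state term), the values of `alpha` and `epsilon`, and the `toy_reward` collision model are all assumptions made for illustration only.

```python
import random


class D2DAgent:
    """One D2D user holding a single-state Q table over (RB, power level) pairs."""

    def __init__(self, num_rbs, power_levels, alpha=0.1, epsilon=0.1, seed=None):
        # Joint action space: every combination of resource block and power level.
        self.actions = [(rb, p) for rb in range(num_rbs) for p in power_levels]
        self.q = {a: 0.0 for a in self.actions}
        self.alpha = alpha        # learning rate (assumed value)
        self.epsilon = epsilon    # exploration probability (assumed value)
        self.rng = random.Random(seed)

    def best_action(self):
        # Greedy choice: the (RB, power) pair with the highest Q value.
        return max(self.actions, key=lambda a: self.q[a])

    def select_action(self):
        # Epsilon-greedy exploration (an assumption, not stated in the abstract).
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action()

    def update(self, action, reward):
        # With a single state, this sketch drops the next-state term and
        # moves Q(action) toward the observed reward.
        self.q[action] += self.alpha * (reward - self.q[action])


def toy_reward(joint_actions, i):
    """Hypothetical reward: the power level if agent i has its RB to itself, else 0."""
    rb, power = joint_actions[i]
    shared = any(other_rb == rb
                 for j, (other_rb, _) in enumerate(joint_actions) if j != i)
    return 0.0 if shared else float(power)


def train(agents, steps):
    # Distributed learning loop: each agent acts and updates from its own reward.
    for _ in range(steps):
        joint = [agent.select_action() for agent in agents]
        for i, agent in enumerate(agents):
            agent.update(joint[i], toy_reward(joint, i))
```

Calling `train([D2DAgent(2, [1, 2], seed=0), D2DAgent(2, [1, 2], seed=1)], 500)` runs the loop for two agents sharing two RBs; each agent's `best_action()` then gives its learned (RB, power level) choice under this toy reward. A realistic reward would be derived from the SINR or throughput model in the paper.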