Joint Mode Selection and Power Adaptation for D2D Communication with Reinforcement Learning.

Yiming Qiu, Zelin Ji, Yonghao Zhu, Guanghao Meng, Gang Xie
DOI: https://doi.org/10.1109/iswcs.2018.8491238
2018-01-01
Abstract: Device-to-Device (D2D) communication is widely used to enhance the performance of cellular networks. On each D2D link, users may choose among different transmission modes and power levels in order to guarantee their quality of service (QoS). The main challenge is the need for distributed optimization algorithms that operate without precise network knowledge. In this paper, we propose a joint mode selection and power adaptation approach based on a conjecture-based multi-agent Q-learning algorithm. We consider a realistic scenario in which the base station (BS) has strictly limited channel knowledge, whereas D2D users and cellular users have only private SINR information. To update Q-values from this private and incomplete information, a long-term observation-based Q-value updating algorithm is proposed. Experiments are conducted to verify the convergence of the algorithm. Numerical results show that the proposed algorithm outperforms other state-of-the-art algorithms and achieves near-optimal energy efficiency.
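For orientation, the joint decision described in the abstract can be pictured as Q-learning over an action space in which each action is a (mode, power level) pair. The sketch below shows only the standard tabular Q-learning update with epsilon-greedy selection; it is not the conjecture-based, long-term observation-based update proposed in the paper, whose details the abstract does not give. The mode names, power levels, state labels, and the helpers `q_update` and `choose_action` are all hypothetical and chosen for illustration.

```python
import itertools
import random

# Assumed joint action space: every (mode, power level) pair is one action.
MODES = ["cellular", "dedicated", "reuse"]  # hypothetical mode names
POWER_LEVELS_MW = [10, 50, 100]             # hypothetical discrete power levels
ACTIONS = list(itertools.product(MODES, POWER_LEVELS_MW))


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place; return the new value.

    `state` stands in for whatever private observation an agent has
    (for example a quantized SINR level); unseen entries default to 0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy choice over the joint (mode, power) actions."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


if __name__ == "__main__":
    q_table = {}
    # The reward would be an energy-efficiency measure in this setting;
    # the value 1.0 here is a placeholder, not a result from the paper.
    q_update(q_table, "sinr_low", ("reuse", 10), 1.0, "sinr_high")
    print(choose_action(q_table, "sinr_low", epsilon=0.0))
```

Treating mode and power as a single joint action keeps the table flat and lets one update rule cover both decisions, at the cost of an action space that grows with the product of the two sets.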