The Transmission Scheme for UAV MIMO System Based on Reinforcement Learning

Wenyi Zhang, Chen Liu, Ge Wang, Yunchao Song
DOI: https://doi.org/10.1109/wcsp55476.2022.10039104
2022-01-01
Abstract: Unmanned aerial vehicle (UAV)-assisted wireless communication is considered an important technique for 6G. However, jointly optimizing the trajectory and beamforming of a UAV is a challenging problem. In this paper, we integrate an improved reinforcement learning scheme into a UAV-enabled MIMO communication system. Our goal is to maximize the sum of received signal energy over a finite time horizon through trajectory design and efficient beamforming at the UAV base station (UAV-BS). First, we model the communication process as a Markov decision process (MDP). Second, we propose a general framework that combines Q-learning with a multi-armed bandit (MAB) to solve the problem. On the one hand, the Q-learning algorithm optimizes the UAV trajectory by learning optimal actions and recording the learned strategies. On the other hand, the MAB finds the beamforming vector at the UAV. Simulation results show that the proposed MAB-Q-learning scheme can effectively optimize the UAV wireless transmission scheme for a mobile ground user.
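To make the division of labor described in the abstract concrete, the following is a minimal sketch of tabular Q-learning choosing UAV moves while a bandit chooses a beam at each step. It is not the authors' implementation: the abstract does not specify the state and action spaces, the channel model, the codebook, or the bandit rule, so everything here is an illustrative assumption. That includes the 5 x 5 grid, the line-of-sight channel with 1/d^2 path loss, the noise-free reward, the DFT-style codebook, the epsilon-greedy bandit keyed by the UAV-to-user offset, the back-and-forth user track, and all function and parameter names.

```python
import cmath
import math
import random

GRID = 5        # UAV moves on a GRID x GRID horizontal grid (assumed)
HEIGHT = 2.0    # fixed UAV altitude in grid units (assumed)
N_ANT = 4       # antennas in a uniform linear array along x (assumed)
N_BEAMS = 8     # size of the beam codebook (assumed)
ACTIONS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # hover, +x, -x, +y, -y


def steering(u):
    """Half-wavelength ULA response for direction cosine u in [-1, 1]."""
    return [cmath.exp(1j * math.pi * n * u) for n in range(N_ANT)]


# Unit-norm beams pointing at evenly spaced direction cosines (assumed codebook).
CODEBOOK = [[x / math.sqrt(N_ANT) for x in steering(-1 + 2 * (b + 0.5) / N_BEAMS)]
            for b in range(N_BEAMS)]


def user_position(t):
    """Assumed user motion: walks back and forth along the row y = 0."""
    period = 2 * (GRID - 1)
    k = t % period
    return (k if k < GRID else period - k, 0)


def received_energy(uav, user, beam):
    """|h^H w|^2 for a line-of-sight channel with 1/d^2 path loss (assumed)."""
    dx, dy = user[0] - uav[0], user[1] - uav[1]
    d = math.sqrt(dx * dx + dy * dy + HEIGHT * HEIGHT)
    h = steering(dx / d)
    gain = sum(hc.conjugate() * w for hc, w in zip(h, CODEBOOK[beam]))
    return abs(gain) ** 2 / d ** 2


def pick_beam(stats, ctx, eps):
    """Epsilon-greedy bandit; stats[ctx][b] = [pull count, mean reward]."""
    arms = stats.setdefault(ctx, [[0, 0.0] for _ in range(N_BEAMS)])
    if random.random() < eps:
        return random.randrange(N_BEAMS)
    return max(range(N_BEAMS), key=lambda b: arms[b][1])


def update_beam(stats, ctx, beam, reward):
    """Incremental mean update for the pulled arm."""
    count, mean = stats[ctx][beam]
    count += 1
    stats[ctx][beam] = [count, mean + (reward - mean) / count]


def step(uav, action):
    """Move the UAV one cell, clipping at the grid boundary."""
    return (min(GRID - 1, max(0, uav[0] + action[0])),
            min(GRID - 1, max(0, uav[1] + action[1])))


def greedy_action(q, state):
    return max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))


def train(episodes=2000, horizon=8, alpha=0.2, gamma=0.9, eps=0.2, seed=0):
    random.seed(seed)
    q, stats = {}, {}   # Q-table for the trajectory, bandit statistics for beams
    for _ in range(episodes):
        uav = (GRID - 1, GRID - 1)
        for t in range(horizon):
            state = (uav, t)
            explore = random.random() < eps
            a = random.randrange(len(ACTIONS)) if explore else greedy_action(q, state)
            uav = step(uav, ACTIONS[a])
            user = user_position(t + 1)
            ctx = (user[0] - uav[0], user[1] - uav[1])  # bandit context (assumed)
            beam = pick_beam(stats, ctx, eps)
            r = received_energy(uav, user, beam)
            update_beam(stats, ctx, beam, r)
            nxt = (uav, t + 1)
            best_next = 0.0 if t + 1 == horizon else max(
                q.get((nxt, i), 0.0) for i in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (r + gamma * best_next - old)
    return q, stats


def rollout(q, stats, horizon=8):
    """Follow the greedy trajectory and greedy beams; return (energy sum, path)."""
    uav, total, path = (GRID - 1, GRID - 1), 0.0, []
    for t in range(horizon):
        uav = step(uav, ACTIONS[greedy_action(q, (uav, t))])
        user = user_position(t + 1)
        beam = pick_beam(stats, (user[0] - uav[0], user[1] - uav[1]), eps=0.0)
        total += received_energy(uav, user, beam)
        path.append(uav)
    return total, path


if __name__ == "__main__":
    q_table, beam_stats = train()
    print(rollout(q_table, beam_stats))
```

In this sketch the Q-table is indexed by UAV cell and time step, so the finite horizon and the user's motion are captured implicitly, while the bandit learns which codebook entry works best for each relative offset. A different state definition, a continuous beamformer, or a UCB-style bandit would slot into the same loop.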