Online Trajectory Optimization for the UAV-Enabled Base Station Multicasting System Based on Reinforcement Learning

Zhang Guangchi, Yan Yulin, Cui Miao, Chen Wei, Zhang Jing
DOI: https://doi.org/10.11999/JEIT210429
2022-01-01
Abstract: To address the communication delay problem in an Unmanned Aerial Vehicle (UAV)-enabled Base Station (BS) multicasting communication system, this paper investigates online trajectory design for the UAV BS. In this system, a UAV BS is dispatched to disseminate common information to multiple ground users simultaneously, and the locations of the ground users are random in each multicasting task. To ensure that the ground users receive the complete multicast information despite the UAV's limited energy, the paper focuses on minimizing the average time the UAV BS takes to complete the multicasting task. First, the problem is cast as a Markov Decision Process (MDP), and the communication delay is introduced into the action-value function. Then, an online trajectory optimization algorithm based on Q-Learning is proposed to minimize the average task completion time. Simulation results show that the proposed algorithm optimizes the UAV BS trajectory online and reduces the duration of the multicast task compared with other benchmark schemes.
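The abstract does not give the exact form in which the delay enters the action-value function. As a rough illustration only, the sketch below shows a standard tabular Q-Learning update under the assumption that the per-step reward is the negative of the communication delay, so that maximizing return shortens the task. The function names, the state and action indexing, and the values of `alpha` and `gamma` are hypothetical and are not taken from the paper.

```python
import random


def q_learning_update(q_table, state, action, delay, next_state, done,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-Learning step and return the updated Q-value.

    Assumption (not from the paper): the reward is the negative delay
    incurred by taking `action` in `state`.
    """
    reward = -delay
    # No future value is bootstrapped once the multicast task is finished.
    future = 0.0 if done else gamma * max(q_table[next_state])
    target = reward + future
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]


def epsilon_greedy(q_table, state, epsilon, rng=random):
    """Explore with probability epsilon, otherwise take the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_table[state]))
    row = q_table[state]
    return max(range(len(row)), key=row.__getitem__)


# Hypothetical usage: 3 discretized UAV positions, 2 movement actions.
q = [[0.0, 0.0] for _ in range(3)]
q_learning_update(q, state=0, action=1, delay=1.0, next_state=1, done=False)
next_action = epsilon_greedy(q, state=0, epsilon=0.1)
```

In an online setting of the kind the abstract describes, such an update would be applied after every UAV move, with user locations redrawn at the start of each multicasting task.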