Joint Trajectory and Passive Beamforming Design for Intelligent Reflecting Surface-Aided UAV Communications: A Deep Reinforcement Learning Approach

Liang Wang, Kezhi Wang, Cunhua Pan, Nauman Aslam
DOI: https://doi.org/10.48550/arXiv.2007.08380
2022-08-30
Abstract: In this paper, the intelligent reflecting surface (IRS)-aided unmanned aerial vehicle (UAV) communication system is studied, where the UAV is deployed to serve the user equipment (UE) with the assistance of multiple IRSs mounted on several buildings to enhance the communication quality between UAV and UE. We aim to maximize the energy efficiency of the system, including the data rate of UE and the energy consumption of UAV via jointly optimizing the UAV's trajectory and the phase shifts of reflecting elements of IRS, when the UE moves and the selection of IRSs is considered for the energy saving purpose. Since the system is complex and the environment is dynamic, it is challenging to derive low-complexity algorithms by using conventional optimization methods. To address this issue, we first propose a deep Q-network (DQN)-based algorithm by discretizing the trajectory, which has the advantage of training time. Furthermore, we propose a deep deterministic policy gradient (DDPG)-based algorithm to tackle the case with continuous trajectory for achieving better performance. The experimental results show that the proposed algorithms achieve considerable performance compared to other traditional solutions.
Signal Processing
What problem does this paper attempt to address?
The paper addresses an unmanned aerial vehicle (UAV) communication system assisted by intelligent reflecting surfaces (IRSs). The question is how to jointly optimize the UAV's flight trajectory and the phase shifts of the IRS reflecting elements so as to maximize the system's energy efficiency, while accounting for user equipment (UE) mobility and selecting which IRSs to use in order to save energy. Specifically, the objective is to increase the UE's data rate while reducing the UAV's energy consumption.

The main challenges are the complexity of the system and the dynamic environment, which make it difficult to design a low-complexity algorithm with traditional optimization methods. For this reason, the authors propose two methods based on deep reinforcement learning (DRL):

1. **Algorithm based on deep Q-network (DQN)**: The UAV's flight trajectory is discretized and a DQN algorithm is applied to the resulting discrete action space. This algorithm has an advantage in training time, although it may lead to a slight decrease in performance.
2. **Algorithm based on deep deterministic policy gradient (DDPG)**: To obtain better performance, the authors also propose a DDPG algorithm that optimizes the UAV's flight trajectory in a continuous action space.

The experimental results show that, compared with traditional solutions, the proposed algorithms can significantly improve the system performance.
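The difference between the discretized trajectory used with DQN and the continuous trajectory used with DDPG can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the step length `STEP_M`, the nine-action set (hover plus eight headings), the `(heading, speed_frac)` encoding of a continuous action, and the helper names. The `energy_efficiency` helper assumes the objective can be read as a simple ratio of delivered bits to consumed energy.

```python
import math

# Hypothetical maximum distance the UAV may travel in one time slot (metres).
STEP_M = 5.0

# Hypothetical discrete action set for a DQN-style agent:
# index 0 is "hover", indices 1..8 are eight compass headings at full step.
DISCRETE_ACTIONS = [(0.0, 0.0)] + [
    (STEP_M * math.cos(k * math.pi / 4), STEP_M * math.sin(k * math.pi / 4))
    for k in range(8)
]


def discrete_move(pos, action_index):
    """Apply one of the fixed displacement vectors to a 2-D position."""
    dx, dy = DISCRETE_ACTIONS[action_index]
    return (pos[0] + dx, pos[1] + dy)


def continuous_move(pos, heading, speed_frac, max_step=STEP_M):
    """Apply a DDPG-style continuous action to a 2-D position.

    heading in [-1, 1] is mapped to an angle in [-pi, pi];
    speed_frac in [0, 1] scales the step length. Inputs are clipped,
    as an actor network's outputs typically would be.
    """
    heading = max(-1.0, min(1.0, heading))
    speed_frac = max(0.0, min(1.0, speed_frac))
    angle = heading * math.pi
    step = speed_frac * max_step
    return (pos[0] + step * math.cos(angle), pos[1] + step * math.sin(angle))


def energy_efficiency(bits_delivered, energy_joules):
    """Assumed objective: delivered bits per joule of consumed energy."""
    if energy_joules <= 0:
        raise ValueError("energy_joules must be positive")
    return bits_delivered / energy_joules
```

With the discrete set, the agent chooses among nine fixed displacements per slot, which keeps the output layer small. The continuous variant can reach any point within `max_step` of the current position, which is the kind of finer control the DDPG-based approach is meant to exploit.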