Three-Dimension Trajectory Design for Multi-UAV Wireless Network With Deep Reinforcement Learning

Wenqi Zhang, Qiang Wang, Xiao Liu, Yuanwei Liu, Yue Chen

DOI: https://doi.org/10.1109/tvt.2020.3047800

IF: 6.8

2021-01-01

IEEE Transactions on Vehicular Technology

Abstract: The effective trajectory design of multiple unmanned aerial vehicles (UAVs) is investigated for improving the capacity of the communication system. The aim is to maximize real-time downlink capacity under the coverage constraint by exploiting the mobility of UAVs. The problem of three-dimensional (3D) dynamic movement of UAVs under the coverage constraint is formulated as a Constrained Markov Decision Process (CMDP) problem, and a constrained Deep Q-Network (cDQN) algorithm is proposed for solving it. In the proposed cDQN model, each UAV acts as an agent that explores and learns its 3D deployment policy. The model aims to obtain the maximum capacity while attempting to guarantee that all ground terminals (GTs) are covered. To satisfy the coverage constraint, a primal-dual method is adopted that trains the primal variable and the dual variable (Lagrangian multiplier) in turn. Furthermore, to reduce the action space of the cDQN algorithm, an action filter uses prior information to eliminate invalid actions. Experimental results demonstrate that the cDQN algorithm converges after some training steps. Additionally, the UAVs are capable of adapting to the movement of GTs under the coverage constraint according to the 3D deployment policy derived from the proposed cDQN algorithm.
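
The primal-dual treatment of a coverage constraint mentioned in the abstract can be written in a generic form. The following is only a sketch of the standard Lagrangian relaxation of a constrained MDP, not the paper's exact formulation; the symbols J_R (expected discounted capacity reward), J_C (expected coverage measure), c (coverage target), \gamma (discount factor), and \eta (dual step size) are assumed here for illustration.

```latex
\begin{aligned}
&\max_{\pi}\; J_R(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_t\right]
\quad \text{s.t.} \quad J_C(\pi) \ge c, \\
&\mathcal{L}(\pi, \lambda) = J_R(\pi) + \lambda\,\bigl(J_C(\pi) - c\bigr), \qquad \lambda \ge 0, \\
&\pi_{k+1} \approx \arg\max_{\pi}\; \mathcal{L}(\pi, \lambda_k), \qquad
\lambda_{k+1} = \max\!\bigl(0,\; \lambda_k - \eta\,\bigl(J_C(\pi_{k+1}) - c\bigr)\bigr).
\end{aligned}
```

Under these assumptions, the multiplier grows whenever coverage falls short of the target (J_C < c), which raises the weight of the coverage term in the next primal step, and it shrinks toward zero once the constraint is met.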

telecommunications; engineering, electrical & electronic; transportation science & technology

What problem does this paper attempt to address?

This paper aims to solve the problem of effective trajectory design of multiple unmanned aerial vehicles (UAVs) in wireless networks to improve the capacity of communication systems. Specifically, the goal of the paper is to maximize the real-time downlink capacity under the coverage constraint, achieving this goal by taking advantage of the mobility of UAVs. In the study, the problem of three-dimensional dynamically moving UAVs under the coverage constraint is formulated as a constrained Markov decision process (CMDP) problem, and a constrained deep Q-network (cDQN) algorithm is proposed to solve this problem. In the proposed cDQN model, each UAV acts as an agent to explore and learn its three-dimensional deployment strategy. The goal of this model is to obtain the maximum capacity while attempting to ensure that all ground terminals (GTs) are covered. To meet the coverage constraint, a primal-dual method is used to alternately train the primal variables and the dual variables (Lagrangian multipliers). In addition, to reduce the action space of the cDQN algorithm, prior information is used to eliminate invalid actions through an action filter. Experimental results show that the cDQN algorithm can converge after some training steps, and UAVs can adapt to the movement of GTs under the coverage constraint according to the three-dimensional deployment strategy derived from the proposed cDQN algorithm.
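
As a rough illustration of the two mechanisms summarized above, the sketch below shows how an action filter might mask invalid moves before a greedy Q-value selection, and how a Lagrangian multiplier might be updated from a coverage shortfall. It is not the authors' implementation: the seven-move action set, the box-shaped flight region, and the names `valid_action_mask`, `masked_greedy_action`, and `dual_update` are all assumptions made for this example.

```python
import numpy as np

# Hypothetical action set (an assumption, not from the paper):
# six unit moves along the x, y, z axes plus hovering in place.
ACTIONS = np.array([
    [1, 0, 0], [-1, 0, 0],
    [0, 1, 0], [0, -1, 0],
    [0, 0, 1], [0, 0, -1],
    [0, 0, 0],
])


def valid_action_mask(position, lower, upper):
    """Return a boolean mask of actions that keep the UAV inside [lower, upper]."""
    next_positions = position + ACTIONS
    return np.all((next_positions >= lower) & (next_positions <= upper), axis=1)


def masked_greedy_action(q_values, mask):
    """Pick the highest-valued action among those the filter allows."""
    filtered = np.where(mask, q_values, -np.inf)
    return int(np.argmax(filtered))


def dual_update(lam, coverage, target, step_size):
    """Projected step on the multiplier: it grows when coverage is below target."""
    return max(0.0, lam - step_size * (coverage - target))


# Example: a UAV at the corner of an assumed 5 x 5 grid, at minimum altitude.
position = np.array([0, 0, 1])
lower = np.array([0, 0, 1])
upper = np.array([4, 4, 3])
mask = valid_action_mask(position, lower, upper)

# Placeholder Q-values standing in for the output of a trained network.
q_values = np.array([0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.0])
action = masked_greedy_action(q_values, mask)

# Coverage of 0.8 against a target of 1.0 pushes the multiplier upward.
lam = dual_update(0.5, coverage=0.8, target=1.0, step_size=0.1)
```

Masking with negative infinity means a filtered action can never be selected greedily, so the agent does not spend exploration on moves that would leave the assumed flight region. In a full training loop the multiplier would be refreshed between rounds of Q-network updates, matching the alternating scheme described above.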

Four-Dimensional Trajectory Generation for UAVs Based on Multi-Agent Q Learning

Joint Neural Network for Trajectory and Communication Design in Multi-UAV Systems

Joint Resource Allocation and Trajectory Design for Multi-UAV Systems With Moving Users: Pointer Network and Unfolding

Three-Dimensional Trajectory Design for Multi-User MISO UAV Communications: A Deep Reinforcement Learning Approach

Collaborative Reinforcement Learning Based Unmanned Aerial Vehicle (UAV) Trajectory Design for 3D UAV Tracking

Deep Reinforcement Learning-Based 3D Trajectory Planning for Cellular Connected UAV

Three-dimensional deep reinforcement learning for trajectory and resource optimization in UAV communication systems

Federated deep reinforcement learning based trajectory design for UAV-assisted networks with mobile ground devices

Deep Reinforcement Learning Based 3D UAV Trajectory Design and Frequency Band Allocation

Trajectory Design for UAV-Based Internet of Things Data Collection: A Deep Reinforcement Learning Approach

Low-Complexity Joint Resource Allocation and Trajectory Design for UAV-Aided Relay Networks With the Segmented Ray-Tracing Channel Model

Reinforcement Learning in Multiple-UAV Networks: Deployment and Movement Design

Mobility-Aware Trajectory Design For Aerial Base Station Using Deep Reinforcement Learning

Three-Dimensional Trajectory and Resource Allocation Optimization in Multi-Unmanned Aerial Vehicle Multicast System: A Multi-Agent Reinforcement Learning Method

3D-Trajectory and Phase-Shift Design for RIS-Assisted UAV Systems Using Deep Reinforcement Learning

Multi-UAV Trajectory Planning for Energy-Efficient Content Coverage: A Decentralized Learning-Based Approach

UAV Swarm Deployment and Trajectory for 3D Area Coverage via Reinforcement Learning

Connectivity-Aware 3D UAV Path Design With Deep Reinforcement Learning

Trajectory Design and Access Control for Air–Ground Coordinated Communications System With Multiagent Deep Reinforcement Learning