Abstract: As an important part of fifth generation (5G) mobile networks, unmanned aerial vehicles (UAVs) have been applied in various communication scenarios due to their high operability and low cost. In this paper, we investigate a multi-UAV communication system with moving users and consider the co-channel interference caused by the transmissions of all other UAVs. To ensure fairness, we maximize the minimum average user rate over the observed time by jointly optimizing the UAVs' trajectories, transmission power, and user association. Because a UAV can cover a large communication area, it does not need to move every time the users move. Therefore, a two-timescale structure is proposed for the considered scenario, where the UAVs' trajectories are optimized based on the channel state information (CSI) on a long timescale, while the transmission power and the user association are optimized based on the instantaneous CSI on a short timescale. To effectively tackle this challenging non-convex problem with both discrete and continuous variables, we propose a joint neural network (NN) design, where a deep reinforcement learning based Pointer Network named advantage pointer-critic (APC) is applied to optimize the discrete variables and a deep-unfolding NN is used to optimize the continuous variables. Specifically, we first formulate a Markov decision process to model the user association, and then employ the APC network, trained by the advantage actor-critic algorithm, to address it. The APC network consists of a Pointer Network and a Multilayer Perceptron. As for the deep-unfolding NN, we first develop a block coordinate descent based algorithm to optimize the UAVs' trajectories and transmission power, and then unfold the algorithm into a layer-wise NN with introduced trainable parameters. These two networks are jointly trained in an unsupervised fashion. Simulation results validate that the proposed joint NN significantly outperforms the optimization algorithm while having much lower complexity, and that it scales and generalizes well.
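To make the fairness objective concrete, the following is a minimal sketch of how the minimum average user rate could be evaluated for a given set of trajectories, powers, and associations. The abstract does not specify the channel model, so the line-of-sight path loss, the reference gain `gain_ref`, the noise power `noise`, and the function name `min_average_rate` are all assumptions made here for illustration, not the authors' formulation.

```python
import math

def min_average_rate(uav_pos, user_pos, power, assoc, noise=1e-9, gain_ref=1e-3):
    """Minimum over users of the time-averaged rate in bits/s/Hz.

    uav_pos[t][m]  -> (x, y, h) of UAV m in slot t
    user_pos[t][k] -> (x, y) of user k in slot t
    power[t][m]    -> transmit power of UAV m in slot t
    assoc[t][k]    -> index of the UAV serving user k in slot t
    Assumed channel: gain = gain_ref / distance^2 (illustrative only).
    """
    num_slots = len(user_pos)
    num_users = len(user_pos[0])
    totals = [0.0] * num_users
    for t in range(num_slots):
        for k, (ux, uy) in enumerate(user_pos[t]):
            # Received power at user k from every UAV in this slot.
            rx = []
            for m, (x, y, h) in enumerate(uav_pos[t]):
                dist_sq = (x - ux) ** 2 + (y - uy) ** 2 + h ** 2
                rx.append(power[t][m] * gain_ref / dist_sq)
            serving = assoc[t][k]
            # Co-channel interference: signals from all non-serving UAVs.
            interference = sum(rx) - rx[serving]
            sinr = rx[serving] / (interference + noise)
            totals[k] += math.log2(1.0 + sinr)
    # Fairness metric: the worst user's average rate over the horizon.
    return min(total / num_slots for total in totals)
```

For example, with two UAVs hovering above two users, associating each user with the nearer UAV yields a higher minimum rate than the swapped association, which is the kind of discrete decision the user association step has to make.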

Distributed Trajectory Design for Cooperative Internet of UAVs Using Deep Reinforcement Learning

Cooperative Internet of UAVs: Distributed Trajectory Design by Multi-Agent Deep Reinforcement Learning

Trajectory Design for UAV-Based Internet of Things Data Collection: A Deep Reinforcement Learning Approach

Reinforcement Learning for Decentralized Trajectory Design in Cellular UAV Networks with Sense-and-Send Protocol

Deep Reinforcement Learning Based Distributed 3D UAV Trajectory Design

Trajectory Design and Resource Allocation for Multi-UAV Networks: Deep Reinforcement Learning Approaches

Cellular UAV-to-Device Communications: Trajectory Design and Mode Selection by Multi-agent Deep Reinforcement Learning

Joint Neural Network for Trajectory and Communication Design in Multi-UAV Systems

Reinforcement Learning for a Cellular Internet of UAVs: Protocol Design, Trajectory Control, and Resource Management

Three-Dimensional Trajectory Design for Multi-User MISO UAV Communications: A Deep Reinforcement Learning Approach

Three-Dimension Trajectory Design for Multi-UAV Wireless Network With Deep Reinforcement Learning

Joint Resource Allocation and Trajectory Design for Multi-UAV Systems With Moving Users: Pointer Network and Unfolding

Multi-UAV Trajectory Design and Power Control Based on Deep Reinforcement Learning

Mobility-Aware Trajectory Design For Aerial Base Station Using Deep Reinforcement Learning

Model-aided Deep Reinforcement Learning for Sample-efficient UAV Trajectory Design in IoT Networks

Multi-agent Deep Reinforcement Learning-based Trajectory Design for UAV-aided Edge Computing System

Deep Reinforcement Learning for Joint Trajectory Planning, Transmission Scheduling, and Access Control in UAV-Assisted Wireless Sensor Networks

Federated Deep Reinforcement Learning Based Trajectory Design for UAV-Assisted Networks with Mobile Ground Devices

Research on the UAV-aided Data Collection and Trajectory Design Based on the Deep Reinforcement Learning

Trajectory Design and Access Control for Air–Ground Coordinated Communications System With Multiagent Deep Reinforcement Learning

Deep Reinforcement Learning Based Trajectory Design and Resource Allocation for UAV-Assisted Communications