Abstract:Unmanned Aerial Vehicles (UAVs) can be deployed as aerial wireless base stations which dynamically cover the wireless communication networks for Ground Users (GUs). The most challenging problem is how to control multi-UAVs to achieve on-demand coverage of wireless communication networks while maintaining connectivity among them. In this paper, the cooperative trajectory optimization of UAVs is studied to maximize the communication efficiency in the dynamic deployment of UAVs for emergency communication scenarios. We transform the problem into a Markov game problem and propose a distributed trajectory optimization algorithm, Double-Stream Attention multi-agent Actor-Critic (DSAAC), based on Multi-Agent Deep Reinforcement Learning (MADRL). The throughput, safety distance, and power consumption of UAVs are comprehensively taken into account for designing a practical reward function. For complex emergency communication scenarios, we design a double data stream network structure that provides a capacity for the Actor network to process state changes. Thus, UAVs can sense the movement trends of the GUs as well as other UAVs. To establish effective cooperation strategies for UAVs, we develop a hierarchical multi-head attention encoder in the Critic network. This encoder can reduce the redundant information through the attention mechanism, which resolves the problem of the curse of dimensionality as the number of both UAVs and GUs increases. We construct a simulation environment for emergency networks with multi-UAVs and compare the effects of the different numbers of GUs and UAVs on algorithms. The DSAAC algorithm improves communication efficiency by 56.7%, throughput by 71.2%, energy saving by 19.8%, and reduces the number of crashes by 57.7%.

Trajectory Design for Multi-UAV-Assisted Data Collection: A Multi-agent Deep Reinforcement Learning Approach

Four-Dimensional Trajectory Generation for UAVs Based on Multi-Agent Q Learning

Federated deep reinforcement learning based trajectory design for UAV-assisted networks with mobile ground devices

Three-Dimensional Trajectory Design for Multi-User MISO UAV Communications: A Deep Reinforcement Learning Approach

Joint UAV trajectory and communication design with heterogeneous multi-agent reinforcement learning

Cellular UAV-to-Device Communications: Trajectory Design and Mode Selection by Multi-agent Deep Reinforcement Learning

Trajectory Design for UAV-Based Internet of Things Data Collection: A Deep Reinforcement Learning Approach

Energy-Efficient Multi-UAVs Cooperative Trajectory Optimization for Communication Coverage: An MADRL Approach

3D-Trajectory and Phase-Shift Design for RIS-Assisted UAV Systems Using Deep Reinforcement Learning

Three-Dimension Trajectory Design for Multi-UAV Wireless Network With Deep Reinforcement Learning

Three-dimensional deep reinforcement learning for trajectory and resource optimization in UAV communication systems

Trajectory Design and Access Control for Air–Ground Coordinated Communications System With Multiagent Deep Reinforcement Learning

Deep Reinforcement Learning for Joint Trajectory Planning, Transmission Scheduling, and Access Control in UAV-Assisted Wireless Sensor Networks

Research on the UAV-aided Data Collection and Trajectory Design Based on the Deep Reinforcement Learning

Trajectory Planning for UAV-Assisted Data Collection in IoT Network: A Double Deep Q Network Approach

Joint Neural Network for Trajectory and Communication Design in Multi-DAV Systems

Three-Dimensional Trajectory and Resource Allocation Optimization in Multi-Unmanned Aerial Vehicle Multicast System: A Multi-Agent Reinforcement Learning Method

Cooperative Internet of UAVs: Distributed Trajectory Design by Multi-Agent Deep Reinforcement Learning.

Collaborative Reinforcement Learning Based Unmanned Aerial Vehicle (UAV) Trajectory Design for 3D UAV Tracking

Maximizing UAV Coverage in Maritime Wireless Networks: A Multiagent Reinforcement Learning Approach