Trajectory Design and Access Control for Air–Ground Coordinated Communications System With Multiagent Deep Reinforcement Learning

Ruijin Ding, Yadong Xu, Feifei Gao, Xuemin Shen

DOI: https://doi.org/10.1109/jiot.2021.3062091

IF: 10.6

2022-04-15

IEEE Internet of Things Journal

Abstract: Unmanned-aerial-vehicle (UAV)-assisted communications have attracted increasing attention recently. This article investigates an air–ground coordinated communications system, in which the trajectories of air UAV base stations (UAV-BSs) and the access control of ground users (GUs) are jointly optimized. We formulate this optimization problem as a mixed cooperative–competitive game, where each GU competes for the limited resources of the UAV-BSs to maximize its own throughput by accessing a suitable UAV-BS, while the UAV-BSs cooperate with each other and design their trajectories to maximize the defined fair throughput, which improves the total throughput while preserving fairness among GUs. Moreover, the action space of the GUs is discrete, while that of the UAV-BSs is continuous. To tackle this hybrid action space issue, we transform the discrete actions into continuous action probabilities and propose a multiagent deep reinforcement learning (MADRL) approach, named air–ground probabilistic multiagent deep deterministic policy gradient (AG-PMADDPG). With well-designed rewards, AG-PMADDPG can coordinate the two types of agents, UAV-BSs and GUs, to achieve their own objectives based on local observations. Simulation results demonstrate that AG-PMADDPG outperforms the benchmark algorithms in terms of throughput and fairness.

Computer Science, Information Systems; Telecommunications; Engineering, Electrical & Electronic

What problem does this paper attempt to address?

This paper addresses how to jointly optimize the trajectory design of unmanned aerial vehicle base stations (UAV-BSs) and the access control of ground users (GUs) in a UAV-assisted communication system. Specifically, it uses multi-agent deep reinforcement learning (MADRL) to jointly optimize the flight paths of the UAV-BSs and the access selection of the GUs in an air-ground coordinated communication system, so as to provide high-throughput and fair communication services.

The main challenges in the paper include:

- **Hybrid action space**: The action space of the GUs is discrete, while that of the UAV-BSs is continuous, so the joint problem has a hybrid action space.
- **Non-convex optimization problem**: The trajectory design of the UAV-BSs is a sequential optimization problem with a large number of decision variables, and it is non-convex, which makes it very difficult to solve directly.
- **Multi-objective optimization**: The GUs and the UAV-BSs have different optimization objectives. Each GU aims to maximize its own long-term throughput, while the UAV-BSs aim to maximize the defined fair throughput, that is, to maintain fairness among users while increasing the total throughput.

To address these challenges, the paper proposes a method named AG-PMADDPG (Air-Ground Probabilistic Multi-Agent Deep Deterministic Policy Gradient). It handles the hybrid action space and coordinates the two types of agents (UAV-BSs and GUs) through appropriate reward design, enabling each to achieve its own goal based on local observations.
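As a rough illustration of the hybrid action space idea, the sketch below shows one way a GU's discrete choice of UAV-BS could be expressed as a continuous probability vector and then sampled. It is not the paper's implementation: the softmax mapping, the function names, and the use of Jain's fairness index as a stand-in for the "defined fair throughput" are all assumptions made for illustration, since the text here does not give the exact definitions.

```python
import math
import random


def action_probabilities(logits):
    """Softmax over raw actor outputs (assumed mapping, not from the paper).

    Returns one probability per UAV-BS, so the GU's discrete choice is
    represented by a continuous vector that a DDPG-style actor can output.
    """
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_access_choice(probs, rng):
    """Draw the index of the UAV-BS that the GU attempts to access."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def jain_fairness(throughputs):
    """Jain's fairness index in (0, 1]; a hypothetical stand-in for fairness."""
    total = sum(throughputs)
    squares = sum(t * t for t in throughputs)
    if squares == 0:
        return 0.0  # no GU received any throughput
    return total * total / (len(throughputs) * squares)


def fair_throughput(throughputs):
    """Total throughput scaled by fairness (illustrative definition only)."""
    return jain_fairness(throughputs) * sum(throughputs)


if __name__ == "__main__":
    rng = random.Random(0)
    probs = action_probabilities([2.0, 1.0, 0.1])  # three candidate UAV-BSs
    print("access probabilities:", probs)
    print("sampled UAV-BS index:", sample_access_choice(probs, rng))
    print("fair throughput:", fair_throughput([4.0, 4.0, 1.0]))
```

Under these assumptions, the GU actor outputs a continuous vector just as a UAV-BS actor outputs a continuous movement, so both agent types can share the same deterministic policy gradient machinery, and the discrete access decision is only drawn at execution time.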

Air-Ground Coordination Communication by Multi-Agent Deep Reinforcement Learning

Joint UAV trajectory and communication design with heterogeneous multi-agent reinforcement learning

Energy-Efficient Multi-UAVs Cooperative Trajectory Optimization for Communication Coverage: An MADRL Approach

Joint Neural Network for Trajectory and Communication Design in Multi-UAV Systems

Cellular UAV-to-Device Communications: Trajectory Design and Mode Selection by Multi-agent Deep Reinforcement Learning

Graph Attention-based Reinforcement Learning for Trajectory Design and Resource Assignment in Multi-UAV Assisted Communication

Three-Dimension Trajectory Design for Multi-UAV Wireless Network With Deep Reinforcement Learning

Multi-Agent Deep Reinforcement Learning for Joint Decoupled User Association and Trajectory Design in Full-Duplex Multi-UAV Networks

Federated deep reinforcement learning based trajectory design for UAV-assisted networks with mobile ground devices

Multi-Agent DRL for Air-to-Ground Communication Planning in UAV-Enabled IoT Networks

Multi-Agent Deep Reinforcement Learning for Secure UAV Communications

Cooperative Internet of UAVs: Distributed Trajectory Design by Multi-Agent Deep Reinforcement Learning

Joint Resource Allocation and Trajectory Design for Multi-UAV Systems With Moving Users: Pointer Network and Unfolding

Distributed Federated Deep Reinforcement Learning Based Trajectory Optimization for Air-Ground Cooperative Emergency Networks

UAV-Enabled Secure Communications by Multi-Agent Deep Reinforcement Learning

Deep Reinforcement Learning for Joint Trajectory Planning, Transmission Scheduling, and Access Control in UAV-Assisted Wireless Sensor Networks

Dynamic Trajectory and Power Control in Ultra-Dense UAV Networks: A Mean-Field Reinforcement Learning Approach

Mobility-Aware Trajectory Design For Aerial Base Station Using Deep Reinforcement Learning

Resource Allocation in UAV-D2D Networks: A Scalable Heterogeneous Multi-Agent Deep Reinforcement Learning Approach

Three-Dimensional Trajectory and Resource Allocation Optimization in Multi-Unmanned Aerial Vehicle Multicast System: A Multi-Agent Reinforcement Learning Method