Abstract: In this article, we focus on a downlink cellular network in which multiple unmanned aerial vehicles (UAVs) serve as aerial base stations for ground users through frequency-division multiple access (FDMA). With user locations and channel parameters inaccessible, the UAVs coordinate to decide on resource allocation and trajectory design in a decentralized way. Aiming to optimize both overall and fairness throughput, we model resource allocation and trajectory design as a decentralized partially observable Markov decision process (Dec-POMDP) and propose multiagent reinforcement learning (RL) as a solution. Specifically, we use a parameterized deep $Q$-network (P-DQN) to handle an action space comprising both discrete and continuous actions, and we leverage the QMIX framework to aggregate each UAV’s local critics. For fairness throughput optimization, we introduce an entropy-like fairness indicator into the reward to make the total return decomposable. In addition, we propose a novel distributed learning framework for overall throughput optimization in which each UAV contributes its local gradient, so that model training can be carried out in parallel without sharing observation data among the UAVs. Simulation results show that the proposed multiagent RL approach and the distributed learning framework are efficient in model training and achieve acceptable performance, close to that of deterministic optimization, which relies on conventional optimization techniques with user locations and channel parameters explicitly known beforehand. For fairness throughput optimization, we also show that ground users achieve individual throughputs close to each other, which verifies the effectiveness of the proposed fairness indicator as the reward definition in the RL framework.
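
To make the idea of an entropy-like fairness indicator concrete, the following is a minimal sketch in Python. The abstract does not give the authors' exact definition, so the form used here (the normalized Shannon entropy of per-user throughput shares) and the function name `entropy_fairness` are assumptions for illustration only. Under this assumed form, the indicator is largest when all users receive equal throughput and smallest when one user receives everything.

```python
import math


def entropy_fairness(throughputs):
    """Hypothetical entropy-like fairness indicator (an assumption, not the
    paper's definition): normalized Shannon entropy of throughput shares.

    Returns a value in [0, 1]; equal throughputs give the maximum.
    """
    total = sum(throughputs)
    n = len(throughputs)
    # With fewer than two users or no throughput, fairness is undefined;
    # this sketch returns 0.0 in that case.
    if n < 2 or total <= 0:
        return 0.0
    shares = [t / total for t in throughputs]
    # Terms with zero share contribute nothing (limit of p*log(p) as p -> 0).
    entropy = -sum(p * math.log(p) for p in shares if p > 0)
    # Divide by log(n), the entropy of the uniform distribution over n users.
    return entropy / math.log(n)


if __name__ == "__main__":
    print(entropy_fairness([2.0, 2.0, 2.0, 2.0]))  # equal shares
    print(entropy_fairness([5.0, 1.0, 1.0, 1.0]))  # skewed shares
```

A reward built on such an indicator would favor allocations that bring individual user throughputs close to one another, which is the behavior the abstract reports for the fairness objective.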

Multi-Agent Reinforcement Learning-Based Resource Sharing in Multi-UAV Wireless Networks

Three-Dimensional Trajectory and Resource Allocation Optimization in Multi-Unmanned Aerial Vehicle Multicast System: A Multi-Agent Reinforcement Learning Method

Matching Combined Multi-Agent Reinforcement Learning for UAV Secure Data Dissemination

Mean-Field Multi-Agent Reinforcement Learning for UAV Assisted Secure Data Dissemination

Resource Allocation in UAV-D2D Networks: A Scalable Heterogeneous Multi-Agent Deep Reinforcement Learning Approach

Resource Allocation and Trajectory Optimization in Multi-UAV Collaborative Vehicular Networks: an Extended Multi-Agent DRL Approach

Multi-Agent Low-Bias Reinforcement Learning for Resource Allocation in UAV-Assisted Networks

Resource Allocation and Trajectory Design in UAV-Aided Cellular Networks Based on Multiagent Reinforcement Learning

Joint Resource Allocation for UAV-assisted V2X Communication with Mean Field Multi-Agent Reinforcement Learning

Multi-Agent Reinforcement Learning Aided Resource Allocation with SARSA in UAV Networks

UAV-assisted fair communications for multi-pair users: A multi-agent deep reinforcement learning method

Joint UAV trajectory and communication design with heterogeneous multi-agent reinforcement learning

Deep Reinforcement Learning Based Resource Allocation in Multi-UAV-Aided MEC Networks

Multi‐agent Reinforcement Learning Based Transmission Scheme for IRS‐assisted Multi‐UAV Systems

Power Allocation and Energy Cooperation for UAV-Enabled MmWave Networks: A Multi-Agent Deep Reinforcement Learning Approach

Multi-Agent Reinforcement Learning for Joint Cooperative Spectrum Sensing and Channel Access in Cognitive UAV Networks

UAV-enabled Collaborative Beamforming via Multi-Agent Deep Reinforcement Learning

Resource Allocation in UAV-Assisted Networks: A Clustering-Aided Reinforcement Learning Approach

Dense Multi-Agent Reinforcement Learning Aided Multi-UAV Information Coverage for Vehicular Networks

Trajectory Design and Bandwidth Assignment for UAVs-enabled Communication Network with Multi-Agent Deep Reinforcement Learning

Multi-Agent DRL for Air-to-Ground Communication Planning in UAV-Enabled IoT Networks