Abstract: As an important part of the fifth generation (5G) mobile networks, unmanned aerial vehicles (UAVs) have been applied in various communication scenarios due to their high operability and low cost. In this paper, we investigate a multi-UAV communication system with moving users and consider the co-channel interference caused by the transmissions of all other UAVs. To ensure fairness, we maximize the minimum average user rate over the observation period by jointly optimizing the UAVs' trajectories, transmission power, and user association. Because UAVs can cover a large area for communications, they do not need to move every time the users move. Therefore, a two-timescale structure is proposed for the considered scenario, where the UAVs' trajectories are optimized based on the channel state information (CSI) on a long timescale, while the transmission power and the user association are optimized based on the instantaneous CSI on a short timescale. To effectively tackle this challenging non-convex problem with both discrete and continuous variables, we propose a joint neural network (NN) design, where a deep reinforcement learning based Pointer Network named advantage pointer-critic (APC) is applied to optimize the discrete variables and a deep-unfolding NN is used to optimize the continuous variables. Specifically, we first formulate a Markov decision process to model the user association, and then employ the APC network, trained by the advantage actor-critic algorithm, to address it. The APC network consists of a Pointer Network and a Multilayer Perceptron. As for the deep-unfolding NN, we first develop a block coordinate descent based algorithm to optimize the UAVs' trajectories and transmission power, and then unfold the algorithm into a layer-wise NN with introduced trainable parameters. These two networks are jointly trained in an unsupervised fashion. Simulation results validate that the proposed joint NN significantly outperforms the optimization-based algorithm at much lower complexity, and that it scales and generalizes well.
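To make the fairness objective concrete, the following is a minimal sketch of how a minimum average user rate under co-channel interference could be evaluated for a given set of trajectories, powers, and associations. It is an illustration only, not the paper's system model: the free-space line-of-sight gain `beta0 / d^2`, the fixed UAV altitude `height`, the noise power `noise`, the assumption that every user sees the full bandwidth, and all function and argument names are assumptions introduced here.

```python
import numpy as np

def min_average_rate(uav_pos, user_pos, power, assoc,
                     height=100.0, beta0=1e-3, noise=1e-9):
    """Minimum over users of the time-averaged rate (bit/s/Hz).

    uav_pos : (T, M, 2) horizontal UAV positions per time slot
    user_pos: (T, K, 2) horizontal user positions per time slot
    power   : (T, M)    transmit power of each UAV per slot
    assoc   : (T, K)    index of the UAV serving each user per slot
    All channel parameters are illustrative assumptions.
    """
    T, K = assoc.shape
    # Squared UAV-to-user distance, shape (T, M, K).
    diff = uav_pos[:, :, None, :] - user_pos[:, None, :, :]
    d2 = np.sum(diff ** 2, axis=-1) + height ** 2
    # Assumed line-of-sight path gain.
    gain = beta0 / d2
    # Power received at each user from each UAV, shape (T, M, K).
    rx = power[:, :, None] * gain
    total = rx.sum(axis=1)                      # (T, K)
    t_idx = np.arange(T)[:, None]
    k_idx = np.arange(K)[None, :]
    signal = rx[t_idx, assoc, k_idx]            # power from the serving UAV
    # All other UAVs act as co-channel interferers.
    sinr = signal / (total - signal + noise)
    rate = np.log2(1.0 + sinr)                  # (T, K)
    # Average over time per user, then take the worst user.
    return rate.mean(axis=0).min()
```

In a two-timescale setting, a function like this would be evaluated with `uav_pos` held fixed over many short slots while `power` and `assoc` change every slot; how the paper actually couples the two timescales is not specified in the abstract.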

Joint Trajectory and Radio Resource Optimization for Autonomous Mobile Robots Exploiting Multi-Agent Reinforcement Learning

Deep Reinforcement Learning Enables Joint Trajectory and Communication in Internet of Robotic Things

Visualizing Multi-Agent Reinforcement Learning for Robotic Communication in Industrial IoT Networks

Joint Neural Network for Trajectory and Communication Design in Multi-UAV Systems

Joint Resource Allocation and Trajectory Design for Multi-UAV Systems With Moving Users: Pointer Network and Unfolding

Intelligent Trajectory Design for RIS-NOMA aided Multi-robot Communications

Energy-Efficient Multi-UAVs Cooperative Trajectory Optimization for Communication Coverage: An MADRL Approach

Joint Communication Resource Allocation and Velocity Optimization in Advanced Air Mobility via Multi-Agent Reinforcement Learning

Non-iterative Optimization of Trajectory and Radio Resource for Aerial Network

Online Decentralized Receding Horizon Trajectory Optimization for Multi-Robot systems

Robust Computation Offloading and Trajectory Optimization for Multi-UAV-Assisted MEC: A Multi-Agent DRL Approach

Joint UAV trajectory and communication design with heterogeneous multi-agent reinforcement learning

Reinforcement Learning based Multi-connectivity Resource Allocation in Factory Automation Systems

Communication-Aware Path Design for Indoor Robots Exploiting Federated Deep Reinforcement Learning

Robot Trajectory Planning With QoS Constrained IRS-assisted Millimeter-Wave Communications

Three-Dimensional Trajectory and Resource Allocation Optimization in Multi-Unmanned Aerial Vehicle Multicast System: A Multi-Agent Reinforcement Learning Method

Three-dimensional deep reinforcement learning for trajectory and resource optimization in UAV communication systems

Joint Sensing Task Assignment and Collision-Free Trajectory Optimization for Mobile Vehicle Networks Using Mean-Field Games

DC-MRTA: Decentralized Multi-Robot Task Allocation and Navigation in Complex Environments

Towards Scalable Continuous-Time Trajectory Optimization for Multi-Robot Navigation