Abstract: Backscatter communication is an emerging technology recognized as a promising solution to the battery problem of Internet of Things (IoT) devices. For example, a wireless sensor network using backscatter communication can monitor the environment in remote areas without battery maintenance or replacement. Unfortunately, the transmission range of backscatter communication is limited. To tackle this challenge, we propose a multi-UAV-aided data collection scenario in which an unmanned aerial vehicle (UAV) flies close to a backscatter sensor node (BSN) to activate it and then collects its data. We aim to minimize the total flight time the rechargeable UAVs take to finish the collection mission. During data collection, a UAV can return to the charging station to recharge when its remaining energy is insufficient to complete the mission. To reduce the complexity of the task, we first use Gaussian mixture model clustering to divide the BSNs into multiple clusters. We then consider two kinds of boundary for the UAV flying regions: deterministic and ambiguous. For the deterministic boundary scenario, we propose a single-agent deep option learning (SADOL) algorithm, in which each UAV cannot fly beyond the deterministic boundary. For the ambiguous boundary scenario, we propose a multiagent deep option learning (MADOL) algorithm that enables the UAVs to cooperatively learn the ambiguous BSN assignment. In simulation, we compare the proposed algorithms with the multiagent deep deterministic policy gradient (MADDPG), deep deterministic policy gradient (DDPG), and deep Q-network (DQN) algorithms; the results show that the proposed algorithms achieve better performance.
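The clustering step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes 2-D BSN coordinates, a spherical covariance per cluster, and hand-picked initial means. The names `gmm_cluster` and `ambiguous_nodes` and the 0.8 threshold are hypothetical. Treating nodes with low posterior confidence as "ambiguous" is only one plausible reading of the ambiguous boundary, not the definition used in the paper.

```python
import math


def gmm_cluster(points, init_means, iters=50, min_var=1e-6):
    """Fit a 2-D spherical Gaussian mixture with EM (illustrative sketch).

    Returns (labels, resp), where resp[i][j] is the posterior probability
    that node i belongs to cluster j.
    """
    k = len(init_means)
    n = len(points)
    means = [tuple(m) for m in init_means]
    variances = [1.0] * k      # assumed initial variance per cluster
    weights = [1.0 / k] * k    # uniform initial mixing weights
    resp = []
    for _ in range(iters):
        # E-step: posterior responsibilities, computed in log space.
        resp = []
        for x, y in points:
            logs = []
            for j in range(k):
                d2 = (x - means[j][0]) ** 2 + (y - means[j][1]) ** 2
                logs.append(math.log(weights[j])
                            - math.log(2 * math.pi * variances[j])
                            - d2 / (2 * variances[j]))
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: re-estimate weight, mean, and variance of each cluster.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty cluster unchanged
            mx = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
            my = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
            var = sum(r[j] * ((p[0] - mx) ** 2 + (p[1] - my) ** 2)
                      for r, p in zip(resp, points)) / (2 * nj)
            means[j] = (mx, my)
            variances[j] = max(var, min_var)
            weights[j] = nj / n
    # Hard label: the cluster with the highest responsibility.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return labels, resp


def ambiguous_nodes(resp, threshold=0.8):
    """Indices of nodes whose best cluster has posterior below threshold.

    The threshold is a hypothetical choice made for this sketch.
    """
    return [i for i, r in enumerate(resp) if max(r) < threshold]


# Hypothetical BSN positions: two well-separated groups of four nodes.
bsns = [(0, 0), (1, 0), (0, 1), (1, 1),
        (10, 10), (11, 10), (10, 11), (11, 11)]
labels, resp = gmm_cluster(bsns, init_means=[(0.0, 0.0), (10.0, 10.0)])
```

In this sketch the hard labels would give each UAV a fixed region, while the soft responsibilities give one way to pick out nodes near a cluster border that several UAVs might share. A library implementation such as scikit-learn's `GaussianMixture` would normally replace the hand-written EM loop.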

Flocking Control of UAV Swarms with Deep Reinforcement Learning Approach

Learning-Based Multi-UAV Flocking Control With Limited Visual Field and Instinctive Repulsion

Oracle-Guided Deep Reinforcement Learning for Large-Scale Multi-UAVs Flocking and Navigation

Model-free Maneuvering Control of Fixed-Wing UAVs Based on Deep Reinforcement Learning

Flocking of Under-Actuated Unmanned Surface Vehicles via Deep Reinforcement Learning and Model Predictive Path Integral Control

PPO-Exp: Keeping Fixed-Wing UAV Formation with Deep Reinforcement Learning

Application of Deep Reinforcement Learning to UAV Swarming for Ground Surveillance

Deep Reinforcement Learning for Flocking Motion of Multi-UAV systems: Learn from a Digital Twin

PASCAL: PopulAtion-Specific Curriculum-based MADRL for collision-free flocking with large-scale fixed-wing UAV swarms

Joint Communication and Action Learning in Multi-Target Tracking of UAV Swarms with Deep Reinforcement Learning

Deep Reinforcement Learning-Driven Collaborative Rounding-Up for Multiple Unmanned Aerial Vehicles in Obstacle Environments

Mean Field Deep Reinforcement Learning for Fair and Efficient UAV Control

Multi-UAV Flocking Control with a Hierarchical Collective Behavior Pattern Inspired by Sheep

Collision-Avoiding Flocking With Multiple Fixed-Wing UAVs in Obstacle-Cluttered Environments: A Task-Specific Curriculum-Based MADRL Approach

Flocking Control of Fixed-Wing UAVs with Cooperative Obstacle Avoidance Capability

Sub-optimal Policy Aided Multi-Agent Reinforcement Learning for Flocking Control

Graph-Based Multi-agent Reinforcement Learning for Large-Scale UAVs Swarm System Control

UAV Swarm Confrontation Using Hierarchical Multiagent Reinforcement Learning

Hierarchical Deep Reinforcement Learning for Backscattering Data Collection With Multiple UAVs

Continuous Deep Hierarchical Reinforcement Learning for Ground-Air Swarm Shepherding

Control and Coordination of a SWARM of Unmanned Surface Vehicles using Deep Reinforcement Learning in ROS