Abstract: Unmanned aerial vehicles (UAVs) have the potential to deliver Internet-of-Things (IoT) services from a great height, creating an airborne domain of the IoT. In this article, we address the problem of autonomous UAV navigation in large-scale complex environments by formulating it as a Markov decision process with sparse rewards, and we propose an algorithm named deep reinforcement learning (RL) with nonexpert helpers (LwH). In contrast to prior RL-based methods that invest heavily in reward shaping, we adopt a sparse reward scheme, i.e., a UAV is rewarded if and only if it completes the navigation task. Using the sparse reward scheme ensures that the solution is not biased toward potentially suboptimal directions. However, the absence of intermediate rewards hinders efficient learning, since informative states are rarely encountered. To handle this challenge, we assume that a prior policy (nonexpert helper), possibly of poor performance, is available to the learning agent. The prior policy guides the agent in exploring the state space by reshaping the behavior policy used for environmental interaction. It also assists the agent in achieving goals by setting dynamic learning objectives of increasing difficulty. To evaluate the proposed method, we construct a simulator for UAV navigation in large-scale complex environments and compare our algorithm with several baselines. Experimental results demonstrate that LwH significantly outperforms state-of-the-art algorithms for handling sparse rewards and yields navigation policies comparable to those learned in an environment with dense rewards.
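
The two ideas named in the abstract, a reward granted only on task completion and a behavior policy reshaped by a nonexpert helper, can be illustrated with a minimal Python sketch. The abstract does not specify how LwH combines the helper with the learner, so the probabilistic mixing below is an assumption chosen for illustration, not the authors' method. The names `sparse_reward`, `behavior_action`, `goal_radius`, and `helper_prob` are all hypothetical.

```python
import random


def sparse_reward(distance_to_goal, goal_radius=1.0):
    """Return 1.0 only when the UAV has reached the goal, otherwise 0.0.

    goal_radius is a hypothetical tolerance; no intermediate (shaped)
    reward is given for merely getting closer to the goal.
    """
    return 1.0 if distance_to_goal <= goal_radius else 0.0


def behavior_action(agent_policy, helper_policy, state, helper_prob, rng=random):
    """Pick the action used for environment interaction.

    Assumption: with probability helper_prob the (possibly poor) helper
    chooses the action, otherwise the learning agent does. This is one
    simple way to let a prior policy reshape exploration.
    """
    if rng.random() < helper_prob:
        return helper_policy(state)
    return agent_policy(state)


if __name__ == "__main__":
    # Toy policies over a 1-D state: the helper always heads toward the goal.
    agent = lambda s: "hover"
    helper = lambda s: "forward"
    print(sparse_reward(0.4), sparse_reward(12.0))
    print(behavior_action(agent, helper, state=0.0, helper_prob=0.5))
```

In a sketch like this, `helper_prob` would typically be decreased over training so that the learner gradually takes over from the helper; that schedule is likewise an assumption rather than something stated in the abstract.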

Oracle-Guided Deep Reinforcement Learning for Large-Scale Multi-UAVs Flocking and Navigation

Learning-Based Multi-UAV Flocking Control With Limited Visual Field and Instinctive Repulsion

Graph-Based Multi-agent Reinforcement Learning for Large-Scale UAVs Swarm System Control

Autonomous Navigation of UAVs in Large-Scale Complex Environments: A Deep Reinforcement Learning Approach

Autonomous Navigation of UAV in Large-Scale Unknown Complex Environment with Deep Reinforcement Learning

Collaborative Decision-Making Method for Multi-UAV Based on Multiagent Reinforcement Learning

UAV Swarm Confrontation Using Hierarchical Multiagent Reinforcement Learning

Deep Reinforcement Learning for Flocking Motion of Multi-UAV systems: Learn from a Digital Twin

PASCAL: PopulAtion-Specific Curriculum-based MADRL for collision-free flocking with large-scale fixed-wing UAV swarms

Application of Deep Reinforcement Learning to UAV Swarming for Ground Surveillance

Deep-Reinforcement-Learning-Based Autonomous UAV Navigation With Sparse Rewards

Multi-UAV Path Planning and Following Based on Multi-Agent Reinforcement Learning

Maximizing UAV Coverage in Maritime Wireless Networks: A Multiagent Reinforcement Learning Approach

Autonomous and cooperative control of UAV cluster with multi-agent reinforcement learning

Cooperative Planning of Multi-UAV Logistics Delivery by Multi-Graph Reinforcement Learning

Research on the Multiagent Joint Proximal Policy Optimization Algorithm Controlling Cooperative Fixed-Wing UAV Obstacle Avoidance

Multi-target tracking for unmanned aerial vehicle swarms using deep reinforcement learning

Group-Based Deep Reinforcement Learning in Multi-UAV Confrontation

Deep Reinforcement Learning-Driven Collaborative Rounding-Up for Multiple Unmanned Aerial Vehicles in Obstacle Environments

A Method of Multi-UAV Cooperative Task Assignment Based on Reinforcement Learning

Multi-Agent Reinforcement Learning for Unmanned Aerial Vehicle Coordination by Multi-Critic Policy Gradient Optimization