Abstract: Unmanned Aerial Vehicles (UAVs), also known as drones, have advanced greatly in recent years. Drones are used in transportation, photography, climate monitoring, and disaster relief, largely because they operate efficiently and safely. Their design is not yet flawless, however, and detecting and preventing collisions remains a major challenge. In this context, this paper describes a methodology for developing a drone system that operates autonomously without human intervention. The study applies reinforcement learning (RL) algorithms to train a drone to avoid obstacles autonomously in discrete and continuous action spaces based solely on image data. Its novelty lies in a comprehensive assessment of the advantages, limitations, and future research directions of obstacle detection and avoidance for drones using different RL techniques. Three RL strategies are compared: Deep Q-Networks (DQN), Proximal Policy Optimization (PPO), and Soft Actor-Critic (SAC). All three can assist in avoiding both stationary and moving obstacles and have proven particularly successful on drones. The experiments were carried out in a virtual environment provided by AirSim, with training and testing scenarios built in Unreal Engine 4 to analyze the behavior of the RL algorithms. According to the training results, SAC outperformed the other two algorithms. PPO was the least successful, suggesting that on-policy algorithms are poorly suited to extensive 3D environments with dynamic actors. DQN and SAC, both off-policy algorithms, produced encouraging outcomes. However, because of its constrained discrete action space, DQN may be less advantageous than SAC in narrow pathways and twists. Overall, for autonomous drones, off-policy algorithms such as DQN and SAC performed more effectively than on-policy algorithms such as PPO. The findings could have practical implications for the development of safer and more efficient drones.
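The remark that a constrained discrete action space may hinder DQN in narrow pathways can be illustrated with a minimal sketch. The seven-command action set, the `MAX_SPEED` value, and all function names below are hypothetical choices made for illustration; they are not taken from the paper or from the AirSim API.

```python
import math

# Hypothetical discrete action set (an assumption, not the paper's):
# each index maps to a fixed velocity command (vx, vy, vz) in m/s.
DISCRETE_ACTIONS = [
    (1.0, 0.0, 0.0),    # forward
    (-1.0, 0.0, 0.0),   # backward
    (0.0, 1.0, 0.0),    # right
    (0.0, -1.0, 0.0),   # left
    (0.0, 0.0, 1.0),    # one direction along the vertical axis
    (0.0, 0.0, -1.0),   # the opposite vertical direction
    (0.0, 0.0, 0.0),    # hover
]

MAX_SPEED = 1.0  # assumed per-axis speed limit in m/s


def discrete_to_velocity(index):
    """DQN-style control: the policy picks one of a fixed set of commands."""
    return DISCRETE_ACTIONS[index]


def continuous_to_velocity(action):
    """SAC-style control: clip a 3-vector to [-1, 1] and scale it to a velocity."""
    return tuple(MAX_SPEED * max(-1.0, min(1.0, a)) for a in action)


def nearest_discrete(target):
    """Index of the discrete command closest (Euclidean) to a target velocity."""
    return min(
        range(len(DISCRETE_ACTIONS)),
        key=lambda i: math.dist(DISCRETE_ACTIONS[i], target),
    )


# A gentle diagonal manoeuvre, as might be needed in a narrow, twisting passage.
target = (0.5, 0.5, 0.0)
idx = nearest_discrete(target)
error = math.dist(discrete_to_velocity(idx), target)

print("closest discrete command:", discrete_to_velocity(idx))
print("approximation error (m/s):", round(error, 3))
print("continuous command:", continuous_to_velocity(target))
```

Under these assumptions the continuous mapping reproduces the diagonal command exactly, while the best discrete command is off by about 0.71 m/s. A finer discrete grid would narrow the gap at the cost of a larger action set.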
