Intelligent Decision-Making for 3-Dimensional Dynamic Obstacle Avoidance of UAV Based on Deep Reinforcement Learning

Xiao Han, Jing Wang, Jiayin Xue, Qinyu Zhang
DOI: https://doi.org/10.1109/WCSP.2019.8928110
2019-01-01
Abstract: With the growing use of UAVs in reconnaissance, agriculture, logistics and entertainment, autonomous collision avoidance during flight has become a necessary capability for modern UAVs, allowing them to sense the surrounding environment and guarantee their own safety. Autonomous obstacle avoidance is a typical agent decision-making problem. Unfortunately, existing traditional decision-making methods perform poorly in this domain; in particular, they cannot meet the requirements of three-dimensional obstacle avoidance for UAVs. We therefore introduce the deep reinforcement learning (DRL) technique into autonomous obstacle avoidance. We model the obstacle avoidance process as a Markov Decision Process and introduce a structure composed of double joint neural network estimators as the decision-maker, whose input is omnidirectional sonar readings and whose output is a value function estimating future rewards. We also propose an adaptation of the memory replay procedure to optimize sampling: we assign weights to the transitions and sample them accordingly. Our method is applied in a 3-dimensional physics environment that contains both random dynamic obstacles and floating bouncing obstacles. The goal of the drone is to reach the terminal point without crashing. Among the methods compared, double Q-learning with priority sampling achieves the best performance in our simulation. Compared with traditional algorithms, the proposed algorithm not only ensures the quality of decision making, enabling the agent to learn the optimal strategy, but also effectively improves task performance and the efficiency of decision making. Simulation results demonstrate its effectiveness.
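The two ideas named in the abstract, a double-estimator value target and weighted sampling from replay memory, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the transition weights are computed or how the networks are structured, so the class `WeightedReplayBuffer`, the function `double_q_target`, the caller-supplied weights and the discount factor `gamma` are all hypothetical names and assumptions, and plain lists of Q-values stand in for neural network outputs.

```python
import random


class WeightedReplayBuffer:
    """Replay memory that samples transitions in proportion to a stored weight.

    Assumption: weights are supplied by the caller (for example, a
    TD-error magnitude); the abstract does not specify how they are set.
    """

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.transitions = []
        self.weights = []
        self.rng = random.Random(seed)

    def add(self, transition, weight=1.0):
        # Drop the oldest transition once the buffer is full.
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)
            self.weights.pop(0)
        self.transitions.append(transition)
        # Keep weights strictly positive so every transition stays sampleable.
        self.weights.append(max(float(weight), 1e-6))

    def sample(self, batch_size):
        # random.choices draws with replacement, proportionally to the weights.
        return self.rng.choices(self.transitions, weights=self.weights, k=batch_size)


def double_q_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double Q-learning target for one transition.

    One estimator (q_online_next) selects the greedy next action and the
    other (q_target_next) evaluates it, which reduces overestimation bias.
    """
    if done:
        # Terminal transition (goal reached or crash): no bootstrapped value.
        return reward
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]
```

In a training loop under these assumptions, each stored transition would carry the sonar readings, the chosen action, the reward and the next readings; a batch drawn with `sample` would then be regressed toward the values returned by `double_q_target`.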