Multi-objective Path Planning Based on Deep Reinforcement Learning

Jian Xu, Fei Huang, Yunfei Cui, Xue Du
DOI: https://doi.org/10.23919/ccc55666.2022.9902302
2022-01-01
Abstract: When an intelligent robot must visit multiple target areas to perform different tasks, it is essential to plan a safe, shortest path before the tasks begin. This paper applies deep reinforcement learning and proposes the hierarchical Twin Delayed Deep Deterministic policy gradient (HTD3) algorithm, built on the Twin Delayed Deep Deterministic policy gradient (TD3) algorithm, so that the robot can learn the optimal multi-objective path planning policy on its own. The main idea is to process multiple objectives hierarchically according to their priorities, with one layer per objective. First, a state vector representing the layers is defined to separate targets of different priorities, and the conditions for switching between layers are given. Second, a unified hierarchical state representation and a hierarchical reward function are defined for all targets; both change automatically with the hierarchical state vector. Finally, the HTD3 algorithm is obtained by combining this layered method with TD3. The proposed algorithm is trained and evaluated in a two-dimensional environment and in a three-dimensional environment, and the simulation results show that HTD3 can effectively complete multi-objective path planning in various environments.
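The layering idea can be illustrated with a small sketch. The abstract does not give the paper's actual definitions, so everything below is an assumption made for illustration: the class name `LayeredTargets`, the one-hot layer vector, the distance-based switching condition (`reach_radius`), the progress-based reward, and the bonus value are all hypothetical. The TD3 learner itself is omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class LayeredTargets:
    """Targets ordered by priority, one layer per target (hypothetical structure)."""
    targets: list               # (x, y) tuples, highest priority first
    reach_radius: float = 0.5   # assumed switching condition: within this distance
    reach_bonus: float = 10.0   # assumed bonus for completing a layer
    layer: int = 0              # index of the currently active layer

    def done(self):
        """True once every layer has been completed."""
        return self.layer >= len(self.targets)

    def layer_vector(self):
        """One-hot vector marking the active layer; all zeros when done."""
        return [1.0 if i == self.layer else 0.0 for i in range(len(self.targets))]

    def state(self, pos):
        """Unified state: layer vector followed by the offset to the active target."""
        if self.done():
            return self.layer_vector() + [0.0, 0.0]
        tx, ty = self.targets[self.layer]
        return self.layer_vector() + [tx - pos[0], ty - pos[1]]

    def step(self, prev_pos, pos):
        """Reward for moving prev_pos -> pos; switches layer when the target is reached."""
        if self.done():
            return 0.0
        target = self.targets[self.layer]
        # Reward is the progress made toward the active layer's target only.
        reward = math.dist(prev_pos, target) - math.dist(pos, target)
        if math.dist(pos, target) <= self.reach_radius:
            reward += self.reach_bonus
            self.layer += 1  # switching condition met: activate the next layer
        return reward


planner = LayeredTargets(targets=[(1.0, 0.0), (1.0, 2.0)])
first_state = planner.state((0.0, 0.0))
first_reward = planner.step((0.0, 0.0), (1.0, 0.0))
second_state = planner.state((1.0, 0.0))
```

Because the state and reward both read the active layer index, a single policy network could in principle be trained across all targets, with the layer vector telling it which objective is currently in force. Obstacle avoidance and the safety part of the reward are left out of this sketch.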