TD3 Based Collision Free Motion Planning for Robot Navigation

Hao Liu, Yi Shen, Chang Zhou, Yuelin Zou, Zijun Gao, Qi Wang
2024-05-24
Abstract: This paper addresses the challenge of collision-free motion planning in automated navigation within complex environments. Utilizing advancements in Deep Reinforcement Learning (DRL) and sensor technologies like LiDAR, we propose the TD3-DWA algorithm, an innovative fusion of the traditional Dynamic Window Approach (DWA) with the Twin Delayed Deep Deterministic Policy Gradient (TD3). This hybrid algorithm enhances the efficiency of robotic path planning by optimizing the sampling interval parameters of DWA to effectively navigate around both static and dynamic obstacles. The performance of the TD3-DWA algorithm is validated through various simulation experiments, demonstrating its potential to significantly improve the reliability and safety of autonomous navigation systems.
Robotics
What problem does this paper attempt to address?
This paper addresses collision-free motion planning for robots in complex environments. It proposes the TD3-DWA algorithm, which builds on Deep Reinforcement Learning (DRL) and LiDAR sensing. By optimizing the sampling interval parameter of the traditional Dynamic Window Approach (DWA), the algorithm improves the efficiency of robot path planning and lets the robot avoid both static and dynamic obstacles.

### Main contributions of the paper:

1. **Algorithm innovation**: Integrates DWA with TD3 (Twin Delayed Deep Deterministic Policy Gradient) to form the new TD3-DWA algorithm.
2. **Performance improvement**: Optimizing the sampling interval parameter of DWA improves the efficiency and reliability of path planning.
3. **Experimental verification**: A variety of simulation experiments verify the performance of TD3-DWA in different environments, especially dynamic ones.

### Key technical points:

- **DWA algorithm**: A traditional local path-planning algorithm that samples velocities, predicts the resulting trajectories, and selects the best one with an evaluation function.
- **TD3 algorithm**: A reinforcement learning algorithm for continuous action spaces. By introducing two independent Q-networks and a delayed update strategy, it mitigates the overestimation problem of the DDPG algorithm.
- **Fusion method**: Combines TD3 with DWA and uses the linear and angular velocities generated by TD3 directly for path planning, improving the smoothness and real-time performance of the path.

### Experimental results:

- **Static obstacle environment**: In 100 tests, the TD3-DWA algorithm had only 2 collisions, and the average path time and length were 11.6 seconds and 11.6 meters respectively.
- **Dynamic obstacle environment**: In a 25x25-meter environment, the TD3-DWA algorithm had no collisions at all, and the average path time and length were 22.3 seconds and 20.4 meters respectively.

### Conclusion:

The TD3-DWA algorithm proposed in this paper performs excellently in collision-free motion planning in complex environments, significantly improving the efficiency and reliability of path planning. Comparison experiments with traditional DWA and other algorithms verify the effectiveness and superiority of this algorithm.
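
To make the key technical points more concrete, here is a minimal Python sketch of one DWA step (sample velocities, predict trajectories, score them) together with the clipped double-Q target that TD3 uses to limit overestimation. It is an illustration under stated assumptions, not the paper's implementation: the unicycle motion model, the scoring weights, the function names `simulate`, `dwa_step`, and `td3_target`, and all default parameter values are hypothetical. The sketch also assumes that the "sampling interval" a learned policy would tune corresponds to `v_range` and `w_range`. The summary does not specify how the paper wires TD3 into DWA.

```python
import math


def simulate(x, y, theta, v, w, dt=0.1, horizon=2.0):
    """Forward-simulate a unicycle model under constant (v, w)."""
    traj = []
    for _ in range(int(round(horizon / dt))):
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += w * dt
        traj.append((x, y, theta))
    return traj


def dwa_step(state, goal, obstacles, v_range, w_range,
             n_v=7, n_w=15, robot_radius=0.3, weights=(1.0, 0.5, 0.1)):
    """Return the best-scoring (v, w) sample, or None if every sample collides.

    v_range and w_range bound the sampled window. In a TD3-DWA style setup
    these bounds are what a learned policy could adjust (an assumption here).
    """
    x, y, theta = state
    best, best_score = None, -math.inf
    for i in range(n_v):
        v = v_range[0] + (v_range[1] - v_range[0]) * i / (n_v - 1)
        for j in range(n_w):
            w = w_range[0] + (w_range[1] - w_range[0]) * j / (n_w - 1)
            traj = simulate(x, y, theta, v, w)
            # Smallest distance between any trajectory point and any obstacle.
            clearance = min(
                (math.hypot(px - ox, py - oy)
                 for px, py, _ in traj for ox, oy in obstacles),
                default=math.inf,
            )
            if clearance <= robot_radius:
                continue  # discard trajectories that would collide
            end_x, end_y, _ = traj[-1]
            goal_term = -math.hypot(goal[0] - end_x, goal[1] - end_y)
            score = (weights[0] * goal_term
                     + weights[1] * min(clearance, 2.0)  # cap clearance reward
                     + weights[2] * v)                   # prefer faster motion
            if score > best_score:
                best, best_score = (v, w), score
    return best


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller critic estimate."""
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


if __name__ == "__main__":
    cmd = dwa_step((0.0, 0.0, 0.0), (5.0, 0.0), [(1.0, 0.0)],
                   v_range=(0.0, 1.0), w_range=(-1.0, 1.0))
    print("chosen (v, w):", cmd)
```

Narrowing or shifting `v_range` and `w_range` changes which trajectories DWA can consider at all, which is why tuning that window can have a large effect on both safety and path efficiency.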