Inspection Robot Navigation Based on Improved TD3 Algorithm

Bo Huang, Jiacheng Xie, Jiawei Yan
DOI: https://doi.org/10.3390/s24082525
IF: 3.9
2024-04-16
Sensors
Abstract: Rapid advances in robotics have made navigation an essential task for mobile robots. Map-based navigation methods depend on global environmental maps for decision-making, so they perform poorly in unfamiliar or dynamic settings. Current deep reinforcement learning navigation strategies can navigate successfully without pre-existing map data, yet they suffer from inefficient training, slow convergence, and sparse rewards. To tackle these challenges, this study introduces an improved twin delayed deep deterministic policy gradient algorithm (LP-TD3) for local planning navigation. First, a long short-term memory (LSTM) module and the Prioritized Experience Replay (PER) mechanism are integrated into the existing TD3 framework to optimize training and improve the efficiency of experience data utilization. Second, an Intrinsic Curiosity Module (ICM) is incorporated to merge intrinsic with extrinsic rewards, which addresses the sparse reward problem and enhances exploratory behavior. Experimental evaluations in the ROS and Gazebo simulators demonstrate that the proposed method outperforms the original TD3 algorithm on various performance metrics.
Engineering, Electrical & Electronic; Chemistry, Analytical; Instruments & Instrumentation
What problem does this paper attempt to address?
In mobile robot navigation, existing map-based methods perform poorly in unfamiliar or dynamic environments. Current deep reinforcement learning navigation strategies do not require pre-existing map data, but they suffer from low training efficiency, slow convergence, and sparse rewards. To overcome these challenges, the study proposes an improved Twin Delayed Deep Deterministic Policy Gradient algorithm (LP-TD3) for local path planning navigation. Its main objectives are:

1. **Improving the efficiency of experience data utilization**: A Long Short-Term Memory (LSTM) module and the Prioritized Experience Replay (PER) mechanism are integrated into the existing TD3 framework to optimize the training process.
2. **Addressing the sparse reward problem**: An Intrinsic Curiosity Module (ICM) combines intrinsic rewards with extrinsic rewards to enhance exploratory behavior.
3. **Enhancing navigation performance**: Experimental validation demonstrates that the proposed method outperforms the original method on various performance metrics.

In summary, the paper aims to enhance the autonomous navigation capability of robots in unknown or dynamic environments through an improved deep reinforcement learning algorithm.
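
To make two of the mechanisms named above more concrete, here is a minimal pure-Python sketch of proportional prioritized experience replay and of a curiosity-style reward mix. It is an illustration under stated assumptions, not the authors' implementation: the names `PrioritizedReplayBuffer`, `intrinsic_reward`, and `total_reward` are hypothetical, the hyperparameters `alpha`, `beta`, `eps`, and `eta` are placeholder values not taken from the paper, and the intrinsic reward is assumed to be the forward-model prediction error in feature space, as in the general ICM formulation. The LSTM and the TD3 actor-critic networks are omitted.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (list-based, O(N) sampling for clarity)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha  # assumed value; 0 would give uniform sampling
        self.eps = eps      # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0        # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability P(i) = p_i / sum_k p_k.
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights (N * P(i))^(-beta), scaled by the batch maximum.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # Priority p_i = (|TD error| + eps)^alpha.
        for i, delta in zip(idxs, td_errors):
            self.priorities[i] = (abs(delta) + self.eps) ** self.alpha


def intrinsic_reward(pred_next_feat, next_feat):
    """Curiosity signal: half the squared forward-model error in feature space."""
    return 0.5 * sum((p - t) ** 2 for p, t in zip(pred_next_feat, next_feat))


def total_reward(r_ext, r_int, eta=0.1):
    """Mix extrinsic and intrinsic rewards; eta is an assumed scaling factor."""
    return r_ext + eta * r_int
```

In a training loop built on this sketch, the agent would store `total_reward(...)` in each transition, sample a batch with `sample`, scale each critic loss term by the returned weight, and then call `update_priorities` with the new TD errors.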