A Soft Actor-Critic Deep Reinforcement-Learning-Based Robot Navigation Method Using LiDAR

Yanjie Liu, Chao Wang, Changsen Zhao, Heng Wu, Yanlong Wei
DOI: https://doi.org/10.3390/rs16122072
IF: 5
2024-06-08
Remote Sensing
Abstract: When there are dynamic obstacles in the environment, it is difficult for traditional path-generation algorithms to achieve desired obstacle-avoidance results. To solve this problem, we propose a robot navigation control method based on SAC (Soft Actor-Critic) Deep Reinforcement Learning. Firstly, we use a fast path-generation algorithm to control the robot to generate expert trajectories when the robot encounters danger as well as when it approaches a target, and we combine SAC reinforcement learning with imitation learning based on expert trajectories to improve the safety of training. Then, for the hybrid data consisting of agent data and expert data, we use an improved prioritized experience replay method to improve the learning efficiency of the policies. Finally, we introduce RNN (Recurrent Neural Network) units into the network structure of the SAC Deep Reinforcement-Learning navigation policy to improve the agent's transfer inference ability in a new environment and obstacle-avoidance ability in dynamic environments. Through simulation and practical experiments, it is fully verified that our method has a higher training efficiency and navigation success rate compared to state-of-the-art reinforcement-learning algorithms, which further enhances the obstacle-avoidance capability of the robot system.
environmental sciences,imaging science & photographic technology,remote sensing,geosciences, multidisciplinary
What problem does this paper attempt to address?
The paper addresses the difficulty that traditional path-generation algorithms have in achieving good obstacle avoidance when the environment contains dynamic obstacles. To overcome this, the authors propose a robot-navigation method based on Soft Actor-Critic (SAC) deep reinforcement learning. Specifically, the paper targets three problems:

1. **Improve navigation ability**: A fast path-generation algorithm produces expert trajectories, and imitation learning on those trajectories is combined with SAC. This improves the agent's navigation ability early in training and increases both training safety and convergence speed.
2. **Optimize data utilization efficiency**: Because expert-trajectory data is more important than agent-trajectory data, the paper modifies the priority calculation of TD-error-based prioritized experience replay, making better use of the data in the experience-replay buffer.
3. **Enhance obstacle-avoidance ability in dynamic environments**: Recurrent Neural Network (RNN) units are introduced into the SAC navigation policy to improve its obstacle avoidance in dynamic environments and its transfer-inference ability in new environments.

Simulation and real-world experiments show that the method achieves higher training efficiency and a higher navigation success rate than existing reinforcement-learning algorithms, further enhancing the obstacle-avoidance ability of the robot system. The paper's main contribution is to improve autonomous-navigation performance in complex dynamic environments by combining deep reinforcement learning with imitation learning and by improving the experience-replay mechanism.
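
To make the second point more concrete, the sketch below shows one way a TD-error-based replay priority could be biased toward expert transitions. It is an illustration only, not the paper's formula: the additive `expert_bonus`, the function names, and the default values of `alpha`, `beta`, and `eps` are all assumptions. The sampling rule and importance weights follow the standard proportional form of prioritized experience replay.

```python
import numpy as np

def priorities(td_errors, is_expert, expert_bonus=1.0, eps=1e-3):
    """Priority = |TD error| + eps, plus a constant bonus for expert transitions.

    The additive bonus is a hypothetical choice for illustration; the paper's
    actual priority calculation is not given in this summary.
    """
    td_errors = np.asarray(td_errors, dtype=float)
    is_expert = np.asarray(is_expert, dtype=bool)
    return np.abs(td_errors) + eps + expert_bonus * is_expert

def sampling_probs(prio, alpha=0.6):
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha."""
    scaled = np.asarray(prio, dtype=float) ** alpha
    return scaled / scaled.sum()

def sample_batch(prio, batch_size, alpha=0.6, beta=0.4, rng=None):
    """Sample buffer indices and importance-sampling weights (max-normalized)."""
    rng = np.random.default_rng() if rng is None else rng
    probs = sampling_probs(prio, alpha)
    idx = rng.choice(len(probs), size=batch_size, p=probs)
    # w_i = (N * P(i))^(-beta) corrects the bias introduced by non-uniform sampling.
    weights = (len(probs) * probs[idx]) ** (-beta)
    return idx, weights / weights.max()

# Two transitions with equal |TD error|; the second comes from the expert.
prio = priorities([0.5, -0.5], [False, True])
probs = sampling_probs(prio)
idx, weights = sample_batch(prio, batch_size=8, rng=np.random.default_rng(0))
```

With equal TD errors, the expert transition receives the larger priority and is therefore sampled more often. In a hybrid buffer this keeps expert data in circulation even after its TD error has shrunk.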