Improved Double Deep Q-Network Algorithm Applied to Multi-Dimensional Environment Path Planning of Hexapod Robots

Liuhongxu Chen, Qibiao Wang, Chao Deng, Bo Xie, Xianguo Tuo, Gang Jiang
DOI: https://doi.org/10.3390/s24072061
IF: 3.9
2024-03-24
Sensors
Abstract: Detecting transportation pipeline leakage points within chemical plants is difficult due to complex pathways, multi-dimensional survey points, and highly dynamic scenarios. However, the maneuverability and adaptability of hexapod robots make them ideal candidates for conducting surveys across different planes. The path-planning problem of hexapod robots in multi-dimensional environments is a significant challenge, especially when identifying suitable transition points and planning shorter paths to reach survey points while traversing multi-level environments. This study proposes a Particle Swarm Optimization (PSO)-guided Double Deep Q-Network (DDQN) approach, namely, the PSO-guided DDQN (PG-DDQN) algorithm, for solving this problem. The proposed algorithm incorporates the PSO algorithm to supplant the traditional random selection strategy, and the data obtained from this guided approach are subsequently employed to train the DDQN neural network. The multi-dimensional random environment is abstracted into localized maps comprising current and next level planes. Comparative experiments were performed with PG-DDQN, standard DQN, and standard DDQN to evaluate the algorithm's performance by using multiple randomly generated localized maps. After each test iteration, the total reward value and completion time of each algorithm were recorded. The results demonstrate that PG-DDQN exhibited faster convergence under an equivalent iteration count. Compared with standard DQN and standard DDQN, reductions in path-planning time of at least 33.94% and 42.60%, respectively, were observed, significantly improving the robot's mobility. Finally, the PG-DDQN algorithm was integrated with sensors onto a hexapod robot, and validation was performed through Gazebo simulations and experiments. The results show that controlling hexapod robots by applying PG-DDQN provides valuable insights for path planning to reach transportation pipeline leakage points within chemical plants.
Engineering, Electrical & Electronic; Chemistry, Analytical; Instruments & Instrumentation
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper aims to solve the path-planning problem of hexapod robots in multi-dimensional environments, especially in the application scenario of detecting leakage points in transportation pipelines in chemical plants. Specifically, the paper focuses on the following challenges:

1. **Path planning in complex environments**:
   - The pipeline layout in chemical plants is complex, with multi-dimensional survey points and highly dynamic scenes, which makes it difficult for traditional path-planning methods to be effectively applied.
   - The mobility and adaptability of hexapod robots make them an ideal choice for surveys on different planes, but how to accurately find transition points and plan the shortest path in multi-dimensional environments remains a significant challenge.
2. **Improving path-planning efficiency**:
   - The performance of traditional path-planning algorithms in multi-dimensional environments is often not satisfactory, especially in terms of selecting appropriate transition points and planning the shortest path.
   - An algorithm that can converge quickly and perform better with the same number of iterations is needed to reduce path-planning time and improve the movement efficiency of robots.
3. **Integrating sensors and simulation verification**:
   - To ensure the effectiveness of the path-planning algorithm, it is necessary to integrate the algorithm with sensors and verify it through simulation and experiments.
   - Through these verifications, the feasibility and reliability of the algorithm in practical applications can be evaluated.

### Solutions

To solve the above problems, the paper proposes a Double Deep Q-Network (DDQN) algorithm guided by Particle Swarm Optimization (PSO), namely the PSO-guided DDQN (PG-DDQN) algorithm. The main improvements of this algorithm include:

1. **PSO algorithm replaces the random selection strategy**:
   - Utilize the global search ability and parallel computing ability of the PSO algorithm to replace the traditional random search strategy and accelerate the accumulation of effective data in the experience pool.
   - This improvement enhances the DDQN algorithm's ability to obtain environmental data, thereby accelerating the convergence speed of the algorithm.
2. **Improved neural network structure**:
   - Use a four-input-layer convolutional neural network structure instead of the traditional BP neural network architecture, reducing the number of learning parameters and improving the efficiency of the algorithm.
   - Abstract the multi-dimensional environment into multiple grid planes, pre-process the map data, and decompose the overall path planning into multiple local plans, further optimizing the performance of the algorithm.
3. **Design of the reward mechanism**:
   - Design a detailed reward structure, including immediate rewards, collision penalties, target point selection rewards, end-point arrival rewards, etc., to provide effective feedback and guide the robot to make optimal decisions.

Through these improvements, the PG-DDQN algorithm shows faster convergence speed and shorter path-planning time in path-planning tasks in multi-dimensional environments, significantly improving the movement efficiency of hexapod robots. Finally, through Gazebo simulation and experimental verification, the effectiveness and reliability of this algorithm in practical applications are proved.
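The two ideas at the core of the approach, guided exploration and the Double DQN update, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the "random selection strategy" being replaced is the exploratory branch of an epsilon-greedy policy, and it models the PSO step as a hypothetical callable named `guide` whose internals are not reproduced here. The function names, parameters, and numeric values are all illustrative assumptions. The target computation follows the standard Double DQN rule, in which the online network selects the next action and the target network evaluates it.

```python
import random


def select_action(q_values, epsilon, guide, rng):
    """Epsilon-greedy selection with a guided exploratory branch."""
    if rng.random() < epsilon:
        # Exploration: ask the guide instead of sampling uniformly at random.
        # `guide` is a hypothetical stand-in for the PSO proposal step.
        return guide()
    # Exploitation: greedy action under the online network's Q-values.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def double_dqn_target(reward, done, q_next_online, q_next_target, gamma=0.99):
    """Standard Double DQN target for a single transition."""
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return reward
    # The online network selects the next action ...
    best_next = max(range(len(q_next_online)), key=lambda a: q_next_online[a])
    # ... and the target network evaluates that action.
    return reward + gamma * q_next_target[best_next]


rng = random.Random(0)
q_values = [0.1, 0.9, 0.3, 0.2]

# epsilon=1.0 always explores, so the guide's proposal (action 2) is returned.
explore = select_action(q_values, 1.0, lambda: 2, rng)
# epsilon=0.0 always exploits, so the greedy action (index 1) is returned.
greedy = select_action(q_values, 0.0, lambda: 2, rng)

# The online net prefers action 1; the target net scores that action as 2.0,
# even though the target net's own maximum (3.0) belongs to action 0.
target_nonterminal = double_dqn_target(1.0, False, [0.2, 0.8], [3.0, 2.0], gamma=0.5)
target_terminal = double_dqn_target(-1.0, True, [0.5, 0.1], [3.0, 4.0], gamma=0.5)

print(explore, greedy, target_nonterminal, target_terminal)
```

In the non-terminal example a plain DQN target would bootstrap from the target network's maximum (3.0), whereas the Double DQN rule uses the value of the action chosen by the online network (2.0). This decoupling is what reduces overestimation of Q-values.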