Random Network Distillation Based Deep Reinforcement Learning for AGV Path Planning

Huilin Yin, Shengkai Su, Yinjia Lin, Pengju Zhen, Karin Festl, Daniel Watzenig

2024-04-19

Abstract: With the flourishing development of intelligent warehousing systems, Automated Guided Vehicle (AGV) technology has grown rapidly. In intelligent warehousing environments, an AGV must safely and rapidly plan an optimal path through complex and dynamic surroundings. Much research has applied deep reinforcement learning to this challenge. However, in environments with sparse extrinsic rewards, these algorithms often converge slowly, learn inefficiently, or fail to reach the target. Random Network Distillation (RND), as an exploration enhancement, can effectively improve the performance of Proximal Policy Optimization (PPO), in particular by providing additional intrinsic rewards to an AGV agent operating in sparse-reward environments. Moreover, most current research still uses 2D grid mazes as experimental environments, which have insufficient complexity and limited action sets. To address this limitation, we present simulation environments for AGV path planning with continuous actions and positions, bringing them closer to realistic physical scenarios. Our experiments and a comprehensive analysis of the proposed method show that it enables the AGV to complete path planning tasks with continuous actions more rapidly in our environments. A video of part of our experiments can be found at

Robotics, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

The paper addresses path planning for Automated Guided Vehicles (AGVs) in complex, dynamic environments. Existing deep reinforcement learning (DRL) methods learn inefficiently and converge slowly when extrinsic rewards are sparse. The authors propose RND-PPO, a DRL method that uses Random Network Distillation (RND) to strengthen the exploration of the Proximal Policy Optimization (PPO) algorithm, particularly in sparse-reward environments. Most existing AGV path planning experiments are conducted in 2D grid mazes, which capture neither the complexity nor the action set of real environments. The paper therefore builds an AGV path planning simulation environment with continuous actions and positions, which is closer to real physical scenarios. RND-PPO adds an intrinsic reward to the extrinsic one, which makes the agent (the AGV) learn more effectively in sparse-reward environments and complete tasks with continuous actions faster. According to the reported experiments, RND-PPO completes path planning tasks in complex environments more efficiently and stably than an AGV trained with PPO alone, especially when the target is dynamic. The intrinsic reward pushes the AGV to explore the entire environment rather than to find a single rewarded target, which helps it adapt to changes in the external environment. In short, the paper proposes an AGV path planning method that combines RND with PPO to improve path planning efficiency in complex, dynamic environments.
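The intrinsic-reward idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the single-layer networks, the feature size, the learning rate, the `beta` weighting, and the names `TinyNet`, `RND`, and `total_reward` are all assumptions made for illustration, and PPO itself is omitted. The sketch only shows the core RND mechanism: a fixed random target network, a predictor trained to imitate it, and a prediction error that serves as a novelty bonus which shrinks for states the agent has already visited.

```python
import math
import random


class TinyNet:
    """Single layer x -> tanh(W x + b); a hypothetical stand-in for an RND network."""

    def __init__(self, in_dim, out_dim, rng):
        self.W = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]
        self.b = [rng.uniform(-1.0, 1.0) for _ in range(out_dim)]

    def forward(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bias)
                for row, bias in zip(self.W, self.b)]


class RND:
    """Random Network Distillation bonus: predictor error against a fixed random target."""

    def __init__(self, obs_dim, feat_dim=8, lr=1.0, seed=0):
        rng = random.Random(seed)
        self.target = TinyNet(obs_dim, feat_dim, rng)     # fixed random network, never trained
        self.predictor = TinyNet(obs_dim, feat_dim, rng)  # trained to imitate the target
        self.lr = lr

    def intrinsic_reward(self, obs):
        # Mean squared prediction error: large for unfamiliar observations.
        t = self.target.forward(obs)
        p = self.predictor.forward(obs)
        return sum((pj - tj) ** 2 for pj, tj in zip(p, t)) / len(t)

    def update(self, obs):
        # One gradient-descent step on the predictor's mean squared error for this observation.
        t = self.target.forward(obs)
        p = self.predictor.forward(obs)
        n = len(t)
        for j in range(n):
            # d(loss)/d(pre-activation) for output unit j, using tanh'(z) = 1 - tanh(z)^2.
            grad_z = 2.0 * (p[j] - t[j]) / n * (1.0 - p[j] ** 2)
            for k, xk in enumerate(obs):
                self.predictor.W[j][k] -= self.lr * grad_z * xk
            self.predictor.b[j] -= self.lr * grad_z


def total_reward(extrinsic, intrinsic, beta=0.5):
    """Reward passed to the policy update; beta is an assumed weighting, not from the paper."""
    return extrinsic + beta * intrinsic


# Observations here are assumed to be continuous (x, y) positions scaled to [0, 1].
rnd = RND(obs_dim=2)
visited, novel = [0.1, 0.1], [0.9, 0.9]
before = rnd.intrinsic_reward(visited)
for _ in range(500):  # the agent keeps returning to the same position
    rnd.update(visited)
after = rnd.intrinsic_reward(visited)
novel_bonus = rnd.intrinsic_reward(novel)
print(f"visited: {before:.4f} -> {after:.6f}, novel: {novel_bonus:.4f}")
```

In this sketch the bonus for the repeatedly visited position decays toward zero while an unvisited position keeps a larger bonus, so even when the extrinsic reward is zero almost everywhere the combined reward still favors unexplored regions.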

Learning to Cooperate: Application of Deep Reinforcement Learning for Online AGV Path Finding

AGV Path Planning Using Curiosity-Driven Deep Reinforcement Learning

Learning Navigation Policies for Mobile Robots in Deep Reinforcement Learning with Random Network Distillation

A decentralized path planning model based on deep reinforcement learning

Research on AGV Path Planning Based on Reinforcement Learning

Path Planning for Autonomous Vehicles in Unknown Dynamic Environment Based on Deep Reinforcement Learning

Deep Reinforcement Learning for Autonomous Ground Vehicle Exploration Without A-Priori Maps

Path Following for Autonomous Ground Vehicle Using DDPG Algorithm: A Reinforcement Learning Approach

Path Planning of Autonomous Mobile Robot in Comprehensive Unknown Environment Using Deep Reinforcement Learning

Intelligent Path Planning for AGV-UAV Transportation in 6G Smart Warehouse

Decentralized Multi-AGV Task Allocation based on Multi-Agent Reinforcement Learning with Information Potential Field Rewards

A Mapless Local Path Planning Approach Using Deep Reinforcement Learning Framework

Research on Dynamic Path Planning of Multi-AGVs Based on Reinforcement Learning

Multi-AGV Path Planning Method via Reinforcement Learning and Particle Filters

A Self-Attention-Based Deep Reinforcement Learning Approach for AGV Dispatching Systems

Deep Reinforcement Learning for Indoor Mobile Robot Path Planning

Real-time local path planning strategy based on deep distributional reinforcement learning

A* guiding DQN algorithm for automated guided vehicle pathfinding problem of robotic mobile fulfillment systems

Supervised Reinforcement Learning for ULV Path Planning in Complex Warehouse Environment

Intelligent Decision and Path Planning Algorithm of AGV Vehicle based on Deep Learning