Abstract: To improve the horizontal transportation efficiency of terminal Automated Guided Vehicles (AGVs), it is necessary to coordinate the time and space synchronization of the loading and unloading equipment with the transportation equipment during operation, and to reduce task completion time. Traditional scheduling methods have limited dynamic response capabilities and are not well suited to dynamic terminal operating environments. This paper therefore discusses how to use delivery task information and AGV spatiotemporal information to schedule AGVs dynamically so as to minimize task delay time and AGV travel time, and proposes a deep reinforcement learning algorithm framework. The framework combines the real-time response and flexibility of the Convolutional Neural Network (CNN) and the Deep Deterministic Policy Gradient (DDPG) algorithm, and can dynamically adjust AGV scheduling strategies according to the input spatiotemporal state information. In the framework, the AGV scheduling process is first defined as a Markov decision process: the system's spatiotemporal state information is analyzed in detail, and assignment heuristic rules and a reward reshaping mechanism are introduced in order to decouple the model from the AGV dynamic scheduling problem. Then, a multi-channel matrix is built to characterize the spatiotemporal state information, the CNN is used to generalize and approximate the action value functions of different states, and the DDPG algorithm is used to achieve the best matching of AGVs and containers in the decision stage. The proposed model and algorithm framework are applied to experiments with different cases, and their scheduling performance is compared with that of the adaptive genetic algorithm (AGA) and the rolling horizon approach (RHPA).
The results show that, compared with a single scheduling rule, the proposed algorithm improves the average task completion time, task delay time, AGV travel time, and task delay rate by 15.63%, 56.16%, 16.36%, and 30.22%, respectively; compared with AGA and RHPA, it reduces task completion time by approximately 3.10% and 2.40%, respectively.
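To make the idea of a multi-channel state matrix and an assignment heuristic rule more concrete, the following is a minimal Python sketch. It is not the paper's implementation: the grid layout, the three channels (AGV presence, task presence, task slack), the names `AGV`, `Task`, `build_state`, and `nearest_idle_agv`, and the nearest-idle-AGV rule are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class AGV:
    row: int
    col: int
    busy_until: float  # time at which the AGV becomes free (assumed field)


@dataclass
class Task:
    row: int
    col: int
    due: float  # desired completion time of the task (assumed field)


def build_state(agvs, tasks, now, height, width):
    """Return a 3-channel grid as nested lists: [channel][row][col].

    Channel 0: number of AGVs in each cell.
    Channel 1: 1.0 where a task is waiting.
    Channel 2: slack of the waiting task (due time minus current time).
    The choice of channels is hypothetical, for illustration only.
    """
    state = [[[0.0] * width for _ in range(height)] for _ in range(3)]
    for agv in agvs:
        state[0][agv.row][agv.col] += 1.0
    for task in tasks:
        state[1][task.row][task.col] = 1.0
        state[2][task.row][task.col] = task.due - now
    return state


def nearest_idle_agv(agvs, task, now):
    """One possible assignment heuristic: pick the idle AGV closest to the
    task by Manhattan distance. Returns its index, or None if none is idle."""
    idle = [i for i, agv in enumerate(agvs) if agv.busy_until <= now]
    if not idle:
        return None
    return min(
        idle,
        key=lambda i: abs(agvs[i].row - task.row) + abs(agvs[i].col - task.col),
    )
```

In a framework like the one described, a matrix of this kind would be the input to the CNN, and a learned policy would choose among heuristic rules such as the one above, rather than applying a single rule throughout.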

Scheduling of AGVs in Automated Container Terminal Based on the Deep Deterministic Policy Gradient (DDPG) Using the Convolutional Neural Network (CNN)
