Abstract: Previous research has applied deep reinforcement learning (DRL) to various single-objective forms of the dynamic flexible job shop scheduling problem (DFJSP), optimizing, e.g., energy consumption, earliness and tardiness penalties, or machine utilization rate, and has reported improvements in these objective metrics over metaheuristic algorithms such as the genetic algorithm (GA) and dispatching rules such as most remaining time first (MRT). However, single-objective optimization on the job shop floor cannot satisfy the requirements of modern smart manufacturing systems, and the multi-objective DFJSP has become mainstream and the core of intelligent workshops. The complex production environment of a real-world factory gives scheduling entities sophisticated characteristics, e.g., non-uniform job processing times, an uncertain number of operations, due-time constraints, and the need to avoid both prolonged slack time and overload on any single machine. These characteristics motivate combining dispatching rules within DRL, so that the scheduler adapts to the manufacturing environment at different rescheduling points and accumulates maximum reward toward a global optimum. In this work, we apply a dual-layer DDQN (DLDDQN) structure to solve the DFJSP in real time with new job arrivals, simultaneously optimizing two objectives: minimization of the total delay time and of the makespan. The framework comprises two layers (agents). The higher one, named the goal selector, uses a DDQN as a function approximator to select one reward form from six proposed forms that embody the two optimization objectives. The lower one, called the actuator, uses a DDQN to choose the dispatching rule with the maximum Q value. Training on the generated benchmark instances converged well, and comparative experiments validated the superiority and generality of the proposed DLDDQN.
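The two-layer decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the Q values, and the way the lower layer is conditioned on the selected reward form (one list of rule Q values per goal) are all assumptions made for illustration. The `ddqn_target` function shows the standard double DQN update target, in which the online network picks the next action and the target network evaluates it.

```python
from typing import Sequence, Tuple


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value (first index wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def ddqn_target(reward: float, gamma: float,
                q_online_next: Sequence[float],
                q_target_next: Sequence[float],
                done: bool) -> float:
    """Standard double DQN target.

    The online network selects the next action; the target network
    supplies the value of that action.
    """
    if done:
        return reward
    best_action = argmax(q_online_next)
    return reward + gamma * q_target_next[best_action]


def dual_layer_step(goal_q: Sequence[float],
                    rule_q_by_goal: Sequence[Sequence[float]]) -> Tuple[int, int]:
    """Greedy two-layer decision at one rescheduling point.

    goal_q: hypothetical Q values of the higher layer (goal selector),
        one per reward form (six in the abstract).
    rule_q_by_goal: hypothetical Q values of the lower layer (actuator),
        one list of dispatching-rule values per reward form. Conditioning
        the actuator on the goal this way is an assumption.
    """
    goal = argmax(goal_q)                 # higher layer: pick a reward form
    rule = argmax(rule_q_by_goal[goal])   # lower layer: rule with max Q value
    return goal, rule


# Illustrative numbers only; they do not come from the paper.
goal_q = [0.1, 0.5, 0.2, 0.0, 0.3, 0.4]
rule_q_by_goal = [
    [0.3, 0.1, 0.2],
    [0.2, 0.9, 0.1],
    [0.5, 0.4, 0.6],
    [0.0, 0.0, 0.1],
    [0.7, 0.2, 0.3],
    [0.1, 0.8, 0.2],
]
goal, rule = dual_layer_step(goal_q, rule_q_by_goal)
target = ddqn_target(1.0, 0.9, [0.2, 0.8], [0.5, 0.3], done=False)
```

In a trained system the Q values would come from the two networks evaluated on the current shop-floor state, and an exploration strategy such as epsilon-greedy would typically replace the purely greedy choice during training.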

Multi-objective Dynamic AGV Scheduling Method Based on Deep Reinforcement Learning

Research on Cooperative Scheduling of AGV Transportation and Charging in Intelligent Warehouse System Based on Dynamic Task Chain

Research on Multi-AGVs dynamic scheduling based on deep reinforcement learning

Multi-Objective Optimization of AGV Real-Time Scheduling Based on Deep Reinforcement Learning

Multi-AGV Dynamic Scheduling in an Automated Container Terminal: A Deep Reinforcement Learning Approach

Research on Intelligent Dynamic Scheduling Algorithm for Automated Guided Vehicles in Container Terminal Based on Deep Reinforcement Learning

Dynamic flexible scheduling with transportation constraints by multi-agent reinforcement learning

Integrated scheduling optimization of U-shaped automated container terminal under loading and unloading mode

A Multiobjective Reinforcement Learning Approach for AGV Task Clustering

Multi-objective AGV scheduling in an FMS using a hybrid of genetic algorithm and particle swarm optimization

Dynamic Multi-Objective Scheduling for Flexible Job Shop by Deep Reinforcement Learning

Multi-objective AGV scheduling in an automatic sorting system of an unmanned (intelligent) warehouse by using two adaptive genetic algorithms and a multi-adaptive genetic algorithm

Dynamic Scheduling and Optimization of AGV in Factory Logistics Systems Based on Digital Twin

Fusion Q-Learning Algorithm for Open Shop Scheduling Problem with AGVs

Mission Scheduling of Multi-AGV System with Dynamic Simulation

An effective self-adaptive iterated greedy algorithm for a multi-AGVs scheduling problem with charging and maintenance

Efficient Multi-Objective Optimization on Dynamic Flexible Job Shop Scheduling Using Deep Reinforcement Learning Approach

A new knowledge-guided multi-objective optimisation for the multi-AGV dispatching problem in dynamic production environments

Multi-objective optimization for scheduling multi-load automated guided vehicles with consideration of energy consumption

Dynamic rolling scheduling model for multi-AGVs in automated container terminals based on spatio-temporal position information

A Self-Attention-Based Deep Reinforcement Learning Approach for AGV Dispatching Systems