Abstract: Because manufacturing environments change dynamically, no single dispatching rule (SDR) consistently outperforms the others on dynamic scheduling problems. Dynamically selecting the most appropriate rule from several SDRs with a Deep Q-Network (DQN) yields better scheduling performance than using any individual SDR. However, the discrete action space imposed by the DQN, together with the restriction of each action to a single SDR, limits the selection range and restricts performance improvement. In this paper, we therefore propose a scheduling method based on deep reinforcement learning (RL) for the dynamic flexible job-shop scheduling problem (DFJSP), aiming to minimize the mean tardiness. First, a Markov decision process with a composite scheduling action is formulated to describe the dynamic flexible job-shop scheduling process and transform the DFJSP into an RL task. Second, a composite scheduling action that aggregates SDRs through continuous weight variables is designed to provide a continuous rule space and SDR weight selection. Third, a reward function tied to the mean tardiness criterion is designed such that maximizing the cumulative reward is equivalent to minimizing the mean tardiness. Finally, a policy network with states as inputs and weights as outputs is constructed to generate the scheduling decision at each decision point, and the deep deterministic policy gradient (DDPG) algorithm is used to train this network to select the most appropriate weights at each decision point, thereby aggregating the SDRs into a better rule. Numerical experiments show that the proposed scheduling method achieves significantly better scheduling results than individual SDRs and the DQN-based method in dynamically changing manufacturing environments.
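To make the idea of a composite scheduling action concrete, the following is a minimal sketch, not the paper's implementation. It assumes three common SDRs (SPT, EDD, FIFO), min-max normalization of each rule's scores, and a weighted sum as the aggregation; the abstract does not specify any of these, and the names `Job`, `composite_select`, and `mean_tardiness` are hypothetical. In the proposed method the weights would come from the DDPG-trained policy network; here they are passed in by hand.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    processing_time: float  # processing time of the next operation
    due_date: float
    arrival_time: float


# Assumed SDR scoring functions: a lower score means a higher priority.
def spt(job):   # shortest processing time
    return job.processing_time


def edd(job):   # earliest due date
    return job.due_date


def fifo(job):  # first in, first out
    return job.arrival_time


SDRS = [spt, edd, fifo]


def normalize(values):
    """Min-max scale scores to [0, 1] so rules with different units are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def composite_select(jobs, weights):
    """Pick the job with the lowest weighted sum of normalized SDR scores.

    `weights` plays the role of the continuous action; this aggregation
    scheme is an illustrative assumption, not the paper's definition.
    """
    if len(weights) != len(SDRS):
        raise ValueError("need one weight per SDR")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    w = [x / total for x in weights]
    per_rule = [normalize([rule(j) for j in jobs]) for rule in SDRS]
    scores = [
        sum(w[k] * per_rule[k][i] for k in range(len(SDRS)))
        for i in range(len(jobs))
    ]
    best = min(range(len(jobs)), key=scores.__getitem__)
    return jobs[best]


def mean_tardiness(completion_times, due_dates):
    """Average of max(0, completion - due) over all jobs: the objective to minimize."""
    tardiness = [max(0.0, c - d) for c, d in zip(completion_times, due_dates)]
    return sum(tardiness) / len(tardiness)


jobs = [
    Job("A", processing_time=5, due_date=10, arrival_time=0),
    Job("B", processing_time=1, due_date=20, arrival_time=2),
    Job("C", processing_time=3, due_date=8, arrival_time=1),
]
print(composite_select(jobs, [1.0, 0.0, 0.0]).name)  # pure SPT
print(composite_select(jobs, [0.5, 0.5, 0.0]).name)  # blend of SPT and EDD
```

With a one-hot weight vector the composite rule reduces to a single SDR, which is the discrete choice a DQN-based selector makes; intermediate weights give rules that no individual SDR expresses, which is the continuous rule space the abstract refers to.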

Spacecraft Resources Dynamic Scheduling Strategy Based on Reinforcement Learning

Spacecraft Attitude Maneuver Planning Based on Deep Reinforcement Learning under Complex Constraints

Deep Reinforcement Learning-Based Autonomous Mission Planning Method for High and Low Orbit Multiple Agile Earth Observing Satellites

Deep Reinforcement Learning-Based Periodic Earth Observation Scheduling for Agile Satellite Constellation

Event-Triggered Deep Reinforcement Learning for Dynamic Task Scheduling in Multi-Satellite Resource Allocation

A Hierarchical Resource Scheduling Method for Satellite Control System Based on Deep Reinforcement Learning

A Dual-System Reinforcement Learning Method for Flexible Job Shop Dynamic Scheduling

Resource allocation strategy of space cloud network based on resource clustering

Dynamic scheduling of decentralized high-end equipment R&D projects via deep reinforcement learning

A deep reinforcement learning approach for dynamic task scheduling of flight tests

Dynamic scheduling for flexible job shop using a deep reinforcement learning approach

Autonomous imaging scheduling networks of small celestial bodies flyby based on deep reinforcement learning

Reinforcement Learning for Adaptive Resource Scheduling in Complex System Environments

Space Information Network Resource Scheduling for Cloud Computing: A Deep Reinforcement Learning Approach

Autonomous spacecraft collision avoidance with a variable number of space debris based on safe reinforcement learning

Optimization Algorithm for Interplanetary Transfer Trajectories of Solar Sailcraft Based on Deep Reinforcement Learning

A New Approach for Resource Scheduling with Deep Reinforcement Learning

A Fast Approach to Satellite Range Rescheduling Using Deep Reinforcement Learning

Reliable Scheduling Algorithm for Space Debris Monitoring Resources Using Adaptive Multipopulation Differential Evolutionary Optimization With Reinforcement Learning

A2C-DRL: Dynamic Scheduling for Stochastic Edge-Cloud Environments Using A2C and Deep Reinforcement Learning

Real-time Control for Fuel-Optimal Moon Landing Based on an Interactive Deep Reinforcement Learning Algorithm