Abstract: The Flexible Job Shop Scheduling Problem (FJSP), a classic NP-hard optimization problem, directly affects the efficiency of manufacturing systems. Because the FJSP involves both job selection and machine selection, it is more complex than the Job Shop Scheduling Problem (JSSP); to tackle it, we introduce, for the first time, a collaborative agent reinforcement learning (CARL) architecture. To enhance the Co-Markov decision process, we use disjunctive graphs to represent state features. Because the way states and actions are represented often leads to suboptimal solutions owing to their intricate variability, we refined our state and action representations to achieve better outcomes. During solving, we employ a Graph Attention Network (GAT) to extract global state information from the disjunctive graph and a Transformer Encoder to quantitatively capture the competitive relationships among machines. We configure two independent encoder–decoder components for the job and machine agents, enabling the generation of two distinct action strategies. Finally, we use the Soft Actor–Critic (SAC) algorithm and an integrated Deep Q Network (DQN), known as D5QN, to train the decision network parameters of the job and machine agents. Our experiments show that after just one training session, the collaborative agents acquire exceptional scheduling strategies. These strategies not only surpass traditional Priority Dispatching Rules (PDR) in solution quality but also outperform the results of some metaheuristic and reinforcement learning algorithms. They are also faster than OR-Tools. Moreover, empirical results on both randomized and benchmark instances underscore the robustness of the learned policies in practical, large-scale scenarios. Notably, on the DPpaulli dataset, which is characterized by a considerable imbalance between the number of operations and machines, our approach achieved optimality in 11 out of 18 FJSP instances.
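
The two coupled decisions that distinguish the FJSP from the JSSP (which job to advance, and which eligible machine to assign its next operation to) can be illustrated with a minimal sketch of a hand-written priority dispatching rule, the kind of baseline the abstract compares against. This is not the CARL method: the instance format, the toy data, the function name `dispatch`, and the two greedy rules (earliest-ready job, earliest-finishing machine) are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical instance format: jobs[j][k] maps each eligible machine id
# to the processing time of operation k of job j.
Instance = List[List[Dict[int, int]]]

# Toy data invented for this example (2 jobs, 2 machines).
TOY: Instance = [
    [{0: 3, 1: 5}, {1: 2}],   # job 0: op 0 may run on machine 0 or 1, op 1 only on machine 1
    [{0: 2}, {0: 4, 1: 3}],   # job 1
]

# Each schedule entry is (job, operation index, machine, start, end).
Entry = Tuple[int, int, int, int, int]


def dispatch(jobs: Instance) -> Tuple[int, List[Entry]]:
    """Greedy two-decision dispatching; returns (makespan, schedule)."""
    next_op = [0] * len(jobs)            # next unscheduled operation of each job
    job_ready = [0] * len(jobs)          # time each job's previous operation ends
    machine_ready: Dict[int, int] = {}   # time each machine becomes free
    schedule: List[Entry] = []

    while any(next_op[j] < len(jobs[j]) for j in range(len(jobs))):
        # Decision 1 (job selection): unfinished job that becomes ready first.
        unfinished = [j for j in range(len(jobs)) if next_op[j] < len(jobs[j])]
        j = min(unfinished, key=lambda i: (job_ready[i], i))
        op = jobs[j][next_op[j]]

        # Decision 2 (machine selection): eligible machine that finishes earliest.
        def finish(m: int) -> int:
            return max(job_ready[j], machine_ready.get(m, 0)) + op[m]

        m = min(op, key=lambda mm: (finish(mm), mm))
        end = finish(m)
        start = end - op[m]

        schedule.append((j, next_op[j], m, start, end))
        job_ready[j] = end
        machine_ready[m] = end
        next_op[j] += 1

    makespan = max(entry[4] for entry in schedule)
    return makespan, schedule


if __name__ == "__main__":
    makespan, schedule = dispatch(TOY)
    print("makespan:", makespan)
    for entry in schedule:
        print(entry)
```

In a learned approach such as the one described above, the two fixed greedy rules would be replaced by the job agent's and machine agent's policies; the scheduling loop itself would stay broadly similar.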

Bridging Reinforcement Learning and Planning to Solve Combinatorial Optimization Problems with Nested Sub-Tasks

A Framework to Co-Optimize Robot Exploration and Task Planning in Unknown Environments

A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs

Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization

End-to-end Multi-Target Flexible Job Shop Scheduling with Deep Reinforcement Learning

Deep Reinforcement Learning with Credit Assignment for Combinatorial Optimization

Multi-Objective Combinatorial Optimization Algorithm Based on Asynchronous Advantage Actor–Critic and Graph Transformer Networks

Constrained Combinatorial Optimization with Reinforcement Learning

A Two-stage Framework and Reinforcement Learning-based Optimization Algorithms for Complex Scheduling Problems

A Unified Pre-training and Adaptation Framework for Combinatorial Optimization on Graphs

Deep reinforcement learning for multi-objective combinatorial optimization: A case study on multi-objective traveling salesman problem

Deep Reinforcement Learning Based Optimization Algorithm for Permutation Flow-Shop Scheduling

Combinatorial-hybrid Optimization for Multi-agent Systems under Collaborative Tasks

A novel collaborative agent reinforcement learning framework based on an attention mechanism and disjunctive graph embedding for flexible job shop scheduling problem

DeepCO: Offline Combinatorial Optimization Framework Utilizing Deep Learning

A Reinforcement Learning Environment For Job-Shop Scheduling

RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark

Cooperative-Guided Ant Colony Optimization with Knowledge Learning for Job Shop Scheduling Problem

Learning to Solve Combinatorial Optimization under Positive Linear Constraints via Non-Autoregressive Neural Networks

Deep Reinforcement Learning for Combinatorial Optimization: Covering Salesman Problems