Abstract: Combining deep reinforcement learning with meta-heuristic techniques is a new research direction for enhancing the search capabilities of meta-heuristic methods in production scheduling. Q-learning is a prominent reinforcement learning method that is used here to guide the selection of actions, which avoids the need for random exploration in the iterative process of a metaheuristic. In this study, we propose Q-learning guided algorithms for the Bi-Criteria No-Wait Flowshop Scheduling Problem (NWFSP). The problem is treated as a bi-criteria combinatorial optimization problem in which total flow time and makespan are optimized simultaneously. First, a deterministic mixed-integer linear programming (MILP) model is provided. Then, two Q-learning guided algorithms are developed: the Bi-Criteria Iterated Greedy Algorithm with Q-Learning (BC-IG QL) and the Bi-Criteria Block Insertion Heuristic Algorithm with Q-Learning (BC-BIH QL). The performance of the proposed Q-learning guided algorithms is compared against the following algorithms: Bi-Criteria Genetic Local Search Algorithms (BC-GLS), the Bi-Criteria Iterated Greedy Algorithm (BC-IG), the Bi-Criteria Iterated Greedy Algorithm with a Local Search (BC-IG ALL), and the Bi-Criteria Variable Block Insertion Heuristic Algorithm (BC-VBIH). The complete computational experiment, performed on the 480 problem instances of Vallada et al. (2015), known as the VRF benchmark set, indicates that the BC-BIH QL and BC-IG QL algorithms outperform the BC-GLS, BC-IG, BC-IG ALL, and BC-VBIH algorithms on the comparative performance metrics. More specifically, the proposed BC-BIH QL and BC-IG QL algorithms yield more non-dominated bi-criteria solutions and are more competitive than the remaining algorithms, while staying competitive with each other on the benchmark problems.
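As a rough illustration of how Q-learning can direct action selection inside a metaheuristic loop, the following minimal Python sketch keeps a Q-table over a small set of discrete actions and picks among them with an epsilon-greedy rule. Everything here is an assumption made for illustration: the class name `QLearningSelector`, the choice of actions (for example, candidate block sizes), the single-state setup, the toy reward, and the parameter values are hypothetical and are not taken from the BC-IG QL or BC-BIH QL algorithms.

```python
import random


class QLearningSelector:
    """Tabular Q-learning over a small discrete action set (hypothetical helper)."""

    def __init__(self, n_states, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        # One row of Q-values per state, one column per action.
        self.q = [[0.0] * len(self.actions) for _ in range(n_states)]

    def choose(self, state):
        """Epsilon-greedy: usually exploit the best-known action, sometimes explore."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.actions))
        row = self.q[state]
        return max(range(len(row)), key=row.__getitem__)

    def update(self, state, action, reward, next_state):
        """Q-learning update: Q <- Q + alpha * (r + gamma * max Q(next) - Q)."""
        td_target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (td_target - self.q[state][action])


# Toy usage: one state, three candidate actions (assumed to be block sizes).
# The reward is purely illustrative: only the action "2" is ever rewarded.
selector = QLearningSelector(n_states=1, actions=[1, 2, 3])
for _ in range(2000):
    a = selector.choose(0)
    reward = 1.0 if selector.actions[a] == 2 else 0.0
    selector.update(0, a, reward, 0)

best = max(range(len(selector.actions)), key=selector.q[0].__getitem__)
print("preferred action:", selector.actions[best])
```

In a scheduling metaheuristic the reward would more plausibly reflect whether the chosen action produced a new non-dominated solution, but that design choice is left open here.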
Moreover, the BC-IG QL algorithm dominates almost 97% and 99% of the solutions reached by the BC-IG, BC-IG ALL, and BC-VBIH algorithms in the small and large datasets, respectively. Similarly, the BC-BIH QL algorithm dominates almost 98% and 99% of the solutions reached by the BC-IG, BC-IG ALL, and BC-VBIH algorithms in the small and large datasets, respectively. This means that, across all the performance measures compared, the Q-learning guided algorithms demonstrate the highest level of competitiveness. The outcomes of this study encourage us to explore further bi-criteria NWFSPs to reveal the trade-offs between other conflicting objectives, such as makespan and the number of early jobs, and thereby address problems faced by various industries.

Q-learning Guided Algorithms for Bi-Criteria Minimization of Total Flow Time and Makespan in No-Wait Permutation Flowshops
