Q-learning Guided Algorithms for Bi-Criteria Minimization of Total Flow Time and Makespan in No-Wait Permutation Flowshops
Damla Yüksel, Levent Kandiller, Mehmet Fatih Taşgetiren
DOI: https://doi.org/10.1016/j.swevo.2024.101617
IF: 10.267
2024-05-30
Swarm and Evolutionary Computation
Abstract: Combining deep reinforcement learning with meta-heuristic techniques is a new research direction for enhancing the search capabilities of meta-heuristic methods in production scheduling. Q-learning is a prominent reinforcement learning method; here it is used to direct the selection of actions, avoiding the need for random exploration in the iterative process of the meta-heuristics. In this study, we provide Q-learning guided algorithms for the Bi-Criteria No-Wait Flowshop Scheduling Problem (NWFSP). The problem is treated as a bi-criteria combinatorial optimization problem in which total flow time and makespan are optimized simultaneously. First, a deterministic mixed-integer linear programming (MILP) model is provided. Then, two Q-learning guided algorithms are developed: the Bi-Criteria Iterated Greedy Algorithm with Q-Learning (BC-IG QL) and the Bi-Criteria Block Insertion Heuristic Algorithm with Q-Learning (BC-BIH QL). The performance of the proposed Q-learning guided algorithms is compared against a collection of algorithms: Bi-Criteria Genetic Local Search Algorithms (BC-GLS), the Bi-Criteria Iterated Greedy Algorithm (BC-IG), the Bi-Criteria Iterated Greedy Algorithm with a Local Search (BC-IG ALL), and the Bi-Criteria Variable Block Insertion Heuristic Algorithm (BC-VBIH). The complete computational experiment, performed on the 480 problem instances of Vallada et al. (2015), known as the VRF benchmark set, indicates that the BC-BIH QL and BC-IG QL algorithms outperform the BC-GLS, BC-IG, BC-IG ALL, and BC-VBIH algorithms on the comparative performance metrics. More specifically, the proposed BC-BIH QL and BC-IG QL algorithms yield more non-dominated bi-criteria solutions, and are more competitive, than the remaining algorithms. At the same time, the two are competitive with each other on the benchmark problems.
Moreover, the BC-IG QL algorithm dominates almost 97% and 99% of the solutions reached by the BC-IG, BC-IG ALL, and BC-VBIH algorithms in the small and large datasets, respectively. Similarly, the BC-BIH QL algorithm dominates almost 98% and 99% of the solutions reached by the BC-IG, BC-IG ALL, and BC-VBIH algorithms in the small and large datasets, respectively. Thus, across all the comparisons made, the Q-learning guided algorithms demonstrate the highest level of competitiveness. The outcomes of this study encourage the exploration of further bi-criteria NWFSPs to reveal the trade-offs between other conflicting objectives, such as makespan and the number of early jobs, and to address problems in various industries.
computer science, artificial intelligence, theory & methods
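
The two criteria named in the abstract, and the notion of one solution dominating another, can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`no_wait_delay`, `objectives`, `dominates`, `non_dominated`) and the tiny processing-time matrix are hypothetical, and the sketch assumes the standard no-wait formulation in which each job runs through all machines without interruption, so a permutation is fully described by the start-time offsets between consecutive jobs.

```python
from typing import List, Sequence, Tuple


def no_wait_delay(p_i: Sequence[int], p_j: Sequence[int]) -> int:
    """Smallest start-time offset on machine 1 between job i and the job j
    scheduled directly after it, so that j never waits between machines."""
    delay = 0
    head_i = 0  # processing of job i on machines 1..k
    head_j = 0  # processing of job j on machines 1..k-1
    for k in range(len(p_i)):
        head_i += p_i[k]
        delay = max(delay, head_i - head_j)
        head_j += p_j[k]
    return delay


def objectives(perm: Sequence[int], p: List[List[int]]) -> Tuple[int, int]:
    """Return (total flow time, makespan) of a job permutation.
    p[j][k] is the processing time of job j on machine k (assumed input layout)."""
    start = 0
    total_flow = 0
    makespan = 0
    for pos, job in enumerate(perm):
        if pos > 0:
            start += no_wait_delay(p[perm[pos - 1]], p[job])
        completion = start + sum(p[job])  # no idle time inside a job
        total_flow += completion
        makespan = completion  # the last job in the permutation finishes last
    return total_flow, makespan


def dominates(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """a dominates b if it is no worse in both criteria and better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def non_dominated(points: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Keep only the points that no other point dominates."""
    return [q for q in points if not any(dominates(r, q) for r in points)]


# Hypothetical 3-job, 2-machine instance, used only for illustration.
p = [[2, 3], [1, 4], [3, 1]]
candidates = [objectives(perm, p) for perm in ([0, 1, 2], [1, 0, 2])]
front = non_dominated(candidates)
```

In a bi-criteria search of the kind the abstract describes, a filter such as `non_dominated` would maintain the archive of trade-off solutions, and the share of one algorithm's archive that is dominated by another's is the kind of quantity the reported percentages refer to.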