Adaptive scheduling for assembly job shop with uncertain assembly times based on dual Q-learning

Haoxiang Wang, Bhaba R. Sarker, Jing Li, Jian Li
DOI: https://doi.org/10.1080/00207543.2020.1794075
IF: 9.018
2020-07-29
International Journal of Production Research
Abstract: To address the uncertainty of the production environment in an assembly job shop, and drawing on the real-time nature of reinforcement learning, a dual Q-learning (D-Q) method is proposed for the assembly job shop scheduling problem that enhances adaptability to environmental changes through self-learning. With the objective of minimising the total weighted earliness penalty and completion time cost, the top-level Q-learning focuses on localised targets to find the dispatching policy that minimises machine idleness and balances machine loads, while the bottom-level Q-learning focuses on global targets to learn the optimal scheduling policy that minimises the overall earliness of all jobs. Theoretical results and simulation experiments indicate that the proposed algorithm generally achieves better results than single Q-learning (S-Q) and other scheduling rules under different product arrival frequencies, and shows good adaptive performance. Abbreviations: AFSSP, assembly flow shop scheduling problem; AJSSP, assembly job shop scheduling problem; RL, reinforcement learning; TASP, two-stage assembly scheduling problem.
engineering, manufacturing, industrial, operations research & management science
What problem does this paper attempt to address?