Parallel machine scheduling minimizing the mean weighted flow time

Zhang Zhi-cong, Zheng Li
DOI: https://doi.org/10.3969/j.issn.1671-3133.2007.09.005
2007-01-01
Abstract: The parallel machine scheduling problem is common in industry. A reinforcement learning (RL) algorithm, Q-learning, was used to solve the unrelated parallel machine scheduling problem Qm|rj,sjk,Mj|∑wjfj. Sequence-dependent setup (changeover) times and machine eligibility constraints were considered. To convert the scheduling problem into an RL problem, it was formulated as a semi-Markov decision process by defining the system state, the actions, and the reward function. Four heuristics, WSPT, Weng's Algorithm, Ranking Algorithm (RA), and LFJ-RA, were defined as actions. Q-learning combined with linear gradient-descent function approximation was used to minimize the mean weighted flow time. Through simulation, Q-learning learned to select optimal or sub-optimal actions in different states. Experimental results show that Q-learning outperforms each of the four heuristics on all test problems.
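
The approach described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a hypothetical feature vector `phi` for the system state, one weight vector per action, a reward equal to the negative weighted flow time accrued between decisions, and a discount of `gamma ** tau` for a sojourn time `tau` in the semi-Markov setting. The class `LinearQ`, the function `wspt_pick`, and all parameter values are illustrative names and choices, not taken from the paper. Only WSPT is written out, using its standard rule of picking the job with the largest weight-to-processing-time ratio; the other three heuristics are represented by their action labels alone.

```python
import random

# Action labels follow the abstract; only WSPT is implemented in this sketch.
ACTIONS = ["WSPT", "Weng", "RA", "LFJ-RA"]


def wspt_pick(jobs):
    """Pick the job with the largest weight / processing-time ratio.

    jobs: list of (job_id, weight, processing_time) tuples (hypothetical layout).
    """
    return max(jobs, key=lambda job: job[1] / job[2])[0]


class LinearQ:
    """Q-learning with a linear approximator, Q(s, a) = w[a] . phi(s)."""

    def __init__(self, n_features, n_actions, alpha=0.05, gamma=0.95,
                 epsilon=0.1, seed=0):
        self.w = [[0.0] * n_features for _ in range(n_actions)]
        self.alpha = alpha      # step size (assumed value)
        self.gamma = gamma      # per-unit-time discount (assumed value)
        self.epsilon = epsilon  # exploration rate (assumed value)
        self.rng = random.Random(seed)

    def q(self, phi, action):
        return sum(wi * xi for wi, xi in zip(self.w[action], phi))

    def select(self, phi):
        """Epsilon-greedy choice among the heuristic actions."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.w))
        values = [self.q(phi, a) for a in range(len(self.w))]
        return values.index(max(values))

    def update(self, phi, action, reward, phi_next, tau=1.0, done=False):
        """One gradient-descent step on the temporal-difference error.

        tau is the time elapsed between the two decision epochs; the
        discount gamma ** tau is an assumption of this sketch.
        """
        if done:
            target = reward
        else:
            best_next = max(self.q(phi_next, b) for b in range(len(self.w)))
            target = reward + (self.gamma ** tau) * best_next
        td_error = target - self.q(phi, action)
        # For a linear approximator the gradient of Q w.r.t. w[action] is phi.
        for i, x in enumerate(phi):
            self.w[action][i] += self.alpha * td_error * x
        return td_error
```

In a simulation loop, the agent would call `select` at each decision epoch, apply the chosen heuristic to dispatch a job, observe the accrued cost, and then call `update` with the next state's features.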