Disassembly Sequence Planning for Target Parts of End-of-life Smartphones Using Q-learning Algorithm

Zepeng Chen, Lin Li, Fu Zhao, John W. Sutherland, Fengfu Yin
DOI: https://doi.org/10.1016/j.procir.2023.02.115
2023-01-01
Procedia CIRP
Abstract: Owing to increasing environmental concerns, disassembly has become an important step in value recovery from end-of-life (EoL) products. As one of the most prevalent electronic devices, smartphones are especially challenging to manage at the end of life. Their diverse internal structures make it difficult to model them and to develop general disassembly plans. Given this background, an improved method that uses a Q-learning algorithm is proposed to optimize the disassembly sequence of EoL smartphones. A constraint relationship among the parts of an EoL smartphone is first developed. The disassembly sequence planning problem is then formalized as a Markov Decision Process. The optimization objective is the disassembly time needed to obtain the target parts. The disassembly time and the target parts are converted into a reward value for the Q-learning algorithm. A State-Action-Reward Matrix is established based on the converted reward value and the disassembly sequence planning problem. A Q-table is trained to find the best action for each state using the State-Action-Reward Matrix. The disassembly sequence with the maximum reward value is obtained from the trained Q-table and the objective function. A case study of the ‘Xiaomi 5’ is conducted to verify the applicability of the proposed method. The experimental results demonstrate that the proposed method can provide feasible sequence planning for targeted parts in the disassembly of EoL smartphones.
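The pipeline outlined in the abstract (constraint relationship, Markov Decision Process, time-based reward, trained Q-table, greedy sequence extraction) can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' implementation. The part names, precedence constraints, removal times, target bonus, and hyperparameters below are all hypothetical, and a `step` function stands in for the explicit State-Action-Reward Matrix, whose construction the abstract does not detail.

```python
import random
from collections import defaultdict

# Hypothetical toy data: each part lists the parts that must be removed first.
PRECEDENCE = {
    "back_cover": set(),
    "battery": {"back_cover"},
    "screws": {"back_cover"},
    "mainboard": {"screws"},
}
TIME = {"back_cover": 20, "battery": 30, "screws": 15, "mainboard": 25}  # seconds (assumed)
TARGET = "mainboard"
TARGET_BONUS = 100.0  # assumed reward for obtaining the target part


def feasible_actions(removed):
    """Parts not yet removed whose predecessors have all been removed."""
    return [p for p in PRECEDENCE if p not in removed and PRECEDENCE[p] <= removed]


def step(removed, part):
    """Remove `part`; the reward is the negative removal time, plus a bonus at the target."""
    reward = -TIME[part] + (TARGET_BONUS if part == TARGET else 0.0)
    return removed | {part}, reward, part == TARGET


def train(episodes=2000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning; states are frozensets of removed parts."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q-table keyed by (state, action), initialised to 0
    for _ in range(episodes):
        state, done = frozenset(), False
        while not done:
            actions = feasible_actions(state)
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[(state, a)])  # exploit
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in feasible_actions(nxt))
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q


def greedy_sequence(q):
    """Follow the highest-valued feasible action from the fully assembled state."""
    state, done, seq = frozenset(), False, []
    while not done:
        action = max(feasible_actions(state), key=lambda a: q[(state, a)])
        state, _, done = step(state, action)
        seq.append(action)
    return seq
```

Calling `greedy_sequence(train())` on this toy instance should skip the battery, since removing it only adds time before the target is reached. Because each step is penalised by its removal time, maximising the cumulative reward is equivalent here to minimising the total time needed to reach the target part.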