Hand-in-Hand Guidance: an Explore-Exploit Based Reinforcement Learning Method for Performance Driven Assembly-Adjustment

Guifang Duan, Yunkun Xu, Zhenyu Liu, Jianrong Tan
DOI: https://doi.org/10.1109/tii.2022.3232774
IF: 12.3
2023-01-01
IEEE Transactions on Industrial Informatics
Abstract: Nowadays, most high-precision products are still assembled manually, which leads to a low one-time pass rate. Workers need to adjust unqualified products repeatedly based on experience, resulting in inefficiency and poor quality consistency. In this work, we propose an explore-exploit reinforcement learning (EERL) framework to suggest which assembly parameters workers need to adjust at each step, and by how much. Combined with a pretrained product performance prediction model, EERL outputs a sequence of decisions that guides workers hand in hand through the adjustment of unqualified products. EERL has a two-phase learning process: 1) exploration; and 2) exploitation. In the exploration phase, the agent is encouraged by curiosity to fully explore the qualified assembly states in the assembly-adjustment feature space. The regulated difference of random network distillation is used as the measure of curiosity. In the exploitation phase, the agent is trained to learn an assembly-adjustment guidance policy that moves from any unqualified initial assembly state to the corresponding qualified state while satisfying the assembly-adjustment constraints. A curriculum learning mechanism is introduced so that the agent learns effectively in a complex environment with sparse rewards and adjustment constraints. The proposed approach is validated on a benchmark optimization function and a case study of a gyroscope. The experimental results demonstrate that the proposed approach outperforms other existing approaches.
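The curiosity signal mentioned in the abstract builds on random network distillation (RND). The sketch below illustrates only the general RND idea, not the paper's implementation: a fixed random target network is imitated by a trainable predictor, and the prediction error serves as a novelty score. The class name `RNDCuriosity`, the network sizes, the learning rate, and the use of the raw squared error (rather than the paper's "regulated difference", whose exact form is not given in the abstract) are all assumptions made for illustration.

```python
import numpy as np


class RNDCuriosity:
    """Illustrative RND novelty score (hypothetical, not the paper's code)."""

    def __init__(self, state_dim, feat_dim=16, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed, randomly initialised target network (never trained).
        self.w_target = rng.normal(size=(state_dim, feat_dim))
        # Trainable linear predictor that tries to imitate the target.
        self.w_pred = np.zeros((state_dim, feat_dim))
        self.lr = lr

    def _error(self, state):
        target = np.tanh(state @ self.w_target)
        prediction = state @ self.w_pred
        return prediction - target

    def novelty(self, state):
        # Mean squared prediction error: high for rarely visited states.
        return float(np.mean(self._error(state) ** 2))

    def update(self, state):
        # One gradient-descent step on the mean squared error for this state.
        err = self._error(state)
        grad = np.outer(state, err) * (2.0 / err.size)
        self.w_pred -= self.lr * grad


curiosity = RNDCuriosity(state_dim=4)
seen = np.array([0.5, -0.2, 0.1, 0.3])     # assumed assembly state, visited often
unseen = np.array([-0.4, 0.6, -0.3, 0.1])  # assumed assembly state, never visited

novelty_before = curiosity.novelty(seen)
for _ in range(200):
    curiosity.update(seen)
novelty_after = curiosity.novelty(seen)
novelty_unseen = curiosity.novelty(unseen)
```

After repeated visits the novelty of `seen` shrinks, while `unseen` keeps a comparatively large score, so an agent rewarded with this score is pushed toward regions of the state space it has not yet covered.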