Deep Reinforcement Learning for Peg-in-hole Assembly Task Via Information Utilization Method

Wang Fei, Cui Ben, Liu Yue, Ren Baiming
DOI: https://doi.org/10.1007/s10846-022-01713-1
2022-01-01
Journal of Intelligent & Robotic Systems
Abstract: Deep reinforcement learning has been widely studied in many fields of robotics. However, its application is seriously restricted by low convergence efficiency. Although demonstration information can effectively improve convergence speed, relying on it too heavily reduces the effectiveness of training in the real environment and degrades convergence. Historical information should also be considered, since it affects both how efficiently information is used and how well the algorithm converges, yet few studies currently address it. This paper proposes an improved reinforcement learning algorithm that adds a demonstration information utilization mechanism and an LSTM network to the Proximal Policy Optimization (PPO) algorithm. Demonstration information provides the robot with a base of prior knowledge, and the utilization mechanism balances the use of demonstration information against interaction information so that data efficiency improves. In addition, we restructure the network used in deep reinforcement learning to incorporate historical information. Experimental results show that the method is feasible. Compared with existing solutions, our method significantly improves the convergence of robot autonomous learning.
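The balance between demonstration data and interaction data described in the abstract can be pictured with a minimal Python sketch. The abstract does not specify the utilization mechanism, so everything here is an assumption for illustration: the linear decay schedule, the names `demo_fraction` and `mixed_batch`, and the default values are hypothetical and are not the authors' method. The last function is the standard PPO clipped surrogate term for a single sample, included only to show where such batches would typically be consumed.

```python
import random


def demo_fraction(step, total_steps, start=0.5, end=0.0):
    """Hypothetical linear schedule (an assumption, not from the paper):
    the share of each batch drawn from demonstrations decays from
    `start` to `end` as training progresses."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * progress


def mixed_batch(demo_buffer, rollout_buffer, batch_size, step, total_steps, rng=random):
    """Build one training batch from demonstration and interaction samples.
    Demonstrations are sampled without replacement, rollouts with replacement."""
    n_demo = min(round(batch_size * demo_fraction(step, total_steps)), len(demo_buffer))
    n_rollout = batch_size - n_demo
    return rng.sample(demo_buffer, n_demo) + rng.choices(rollout_buffer, k=n_rollout)


def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate for one sample:
    min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    demos = [("demo", i) for i in range(20)]
    rollouts = [("rollout", i) for i in range(100)]
    for step in (0, 500, 1000):
        batch = mixed_batch(demos, rollouts, batch_size=8, step=step, total_steps=1000)
        n_demo = sum(1 for source, _ in batch if source == "demo")
        print(step, n_demo, len(batch) - n_demo)
```

Under this assumed schedule, early batches lean on demonstrations and later batches consist only of the robot's own interaction data, which is one simple way to avoid over-reliance on demonstrations; the paper's actual mechanism may differ.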