Online Optimization Method of Controller Parameters for Robot Constant Force Grinding Based on Deep Reinforcement Learning Rainbow

Tie Zhang, Chao Yuan, Yanbiao Zou
DOI: https://doi.org/10.1007/s10846-022-01688-z
2022-08-03
Journal of Intelligent and Robotic Systems: Theory and Applications
Abstract: Robot grinding requires real-time constant force control. However, the low stiffness of the robot causes its end to deform during grinding, which makes it difficult to keep the grinding force stable. This paper therefore proposes an online optimization method of controller parameters for constant force grinding based on the deep reinforcement learning algorithm Rainbow. The method converges stably while optimizing the control parameters online and achieves constant force control during grinding. Firstly, based on a force analysis of the grinding model, we established the relationship between the grinding force and the trajectory compensation, and built a robot constant force grinding controller that uses the measured grinding force to adjust the grinding trajectory. Secondly, to tune the parameters of this controller, we selected Rainbow as the optimizer and designed the training environment for the reinforcement learning agent. The environment includes a state space composed of different controller parameters, an action space of state transformations, a Noisy Categorical Dueling (NCD) network that maps agent states to action rewards, and a reward function with two-stage sample evaluation. Optimizing the constant force controller parameters online reduces the training time cost. To guard against grinding force overload during sampling and to improve the utilization rate of sample data, an online sampling method for the grinding force was proposed. In experiments on a robotic constant force grinding system, Rainbow converged quickly to a stable policy. The results show that the grinding force stays more consistently close to the expected value and that the roughness of the machined surface decreases. Compared with empirically adjusted parameters and with sampled parameters that gave poor grinding performance, the roughness was reduced by 26.54% and 78.39% respectively, verifying the effectiveness and practicality of the proposed method.
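The environment described in the abstract can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the authors' implementation: the abstract does not give the controller form, the parameter grid, the action set, or the reward formula. Here the state is taken to be an index pair on a hypothetical grid of two gains (`kp_grid`, `ki_grid`), each action moves one gain by one grid step, and the "two-stage" reward is interpreted as a short overload check followed by a tracking-error score. `sample_forces`, `target_force`, `overload_force`, and `probe_len` are invented names and values. The Rainbow agent and its NCD network are not shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical action set: keep both gains, or move one gain by one grid step.
ACTIONS: List[Tuple[int, int]] = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]


def two_stage_reward(forces: Sequence[float], target: float,
                     overload: float, probe_len: int = 10) -> float:
    """Assumed two-stage evaluation of one batch of force samples."""
    # Stage 1 (assumption): inspect a short probe window and reject the
    # parameter set immediately if any sample exceeds the overload limit.
    if any(abs(f) > overload for f in forces[:probe_len]):
        return -1.0
    # Stage 2 (assumption): score the full batch by its mean absolute
    # deviation from the expected force, normalised by the target.
    mean_abs_err = sum(abs(f - target) for f in forces) / len(forces)
    return 1.0 - mean_abs_err / target


@dataclass
class ParamTuningEnv:
    kp_grid: Sequence[float]          # hypothetical candidate gains
    ki_grid: Sequence[float]          # hypothetical candidate gains
    # Hypothetical hook: runs the controller with the given gains and
    # returns measured grinding forces (real robot or simulation).
    sample_forces: Callable[[float, float], Sequence[float]]
    target_force: float = 20.0        # assumed expected force, N
    overload_force: float = 40.0      # assumed safety limit, N
    state: Tuple[int, int] = (0, 0)   # indices into the two grids

    def step(self, action: int) -> Tuple[Tuple[int, int], float]:
        """Apply one state transformation and return (state, reward)."""
        di, dj = ACTIONS[action]
        # Clamp so the agent cannot leave the parameter grid.
        i = min(max(self.state[0] + di, 0), len(self.kp_grid) - 1)
        j = min(max(self.state[1] + dj, 0), len(self.ki_grid) - 1)
        self.state = (i, j)
        forces = self.sample_forces(self.kp_grid[i], self.ki_grid[j])
        reward = two_stage_reward(forces, self.target_force,
                                  self.overload_force)
        return self.state, reward
```

An agent such as Rainbow would call `step` repeatedly, choosing the action index from its learned value estimates. The early return in stage 1 reflects one way the overload safeguard mentioned in the abstract could shorten unsafe trials.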