Learning to Regrasp Using Visual–Tactile Representation-Based Reinforcement Learning
Zhuangzhuang Zhang, Han Sun, Zhenning Zhou, Yizhao Wang, Huang Huang, Zhinan Zhang, Qixin Cao
DOI: https://doi.org/10.1109/tim.2024.3470030
IF: 5.6
2024-10-11
IEEE Transactions on Instrumentation and Measurement
Abstract: The open-loop grasp planner, which relies on vision, is prone to failure caused by calibration errors, visual occlusions, and other factors. Additionally, it cannot adapt the grasp pose and gripping force in real time, thereby increasing the risk of potential damage to unidentified objects. This work presents a multimodal regrasp control framework based on deep reinforcement learning (RL). Given a coarse initial grasp pose, the proposed regrasping policy efficiently optimizes grasp pose and gripping force by deeply fusing visual and high-resolution tactile data in a closed-loop fashion. To enhance the sample efficiency and generalization capability of the RL algorithm, this work leverages self-supervision to pretrain a visual-tactile representation model, which serves as a feature extraction network during RL policy training. The RL policy is trained purely in simulation and successfully deployed to a real-world environment via domain adaptation and domain randomization techniques. Extensive experimental results in simulation and real-world environments indicate that the robot guided by the regrasping policy is able to achieve gentle grasping of unknown objects with high success rates. Finally, the comparison results with the state-of-the-art algorithm also demonstrate the superiority of our algorithm.
engineering, electrical & electronic; instruments & instrumentation