Sim-To-Real Transfer for Visual Reinforcement Learning of Deformable Object Manipulation for Robot-Assisted Surgery

Paul Maria Scheikl, Eleonora Tagliabue, Balázs Gyenes, Martin Wagner, Diego Dall'Alba, Paolo Fiorini, Franziska Mathis-Ullrich
DOI: https://doi.org/10.1109/LRA.2022.3227873
2024-06-10
Abstract: Automation holds the potential to assist surgeons in robotic interventions, shifting their mental workload from visuomotor control to high-level decision making. Reinforcement learning has shown promising results in learning complex visuomotor policies, especially in simulation environments where many samples can be collected at low cost. A core challenge is learning policies in simulation that can be deployed in the real world, thereby overcoming the sim-to-real gap. In this work, we bridge the visual sim-to-real gap with an image-based reinforcement learning pipeline based on pixel-level domain adaptation and demonstrate its effectiveness on an image-based task in deformable object manipulation. We choose a tissue retraction task because of its importance in the clinical reality of precise cancer surgery. After training in simulation on domain-translated images, our policy requires no retraining to perform tissue retraction with a 50% success rate on the real robotic system using raw RGB images. Furthermore, our sim-to-real transfer method makes no assumptions about the task itself and requires no paired images. This work introduces the first successful application of visual sim-to-real transfer for robotic manipulation of deformable objects in the surgical field, which represents a notable step towards the clinical translation of cognitive surgical robotics.
Robotics
What problem does this paper attempt to address?
This paper addresses the problem of transferring policies for robot-assisted surgery that were learned in simulation to the real world. It proposes an image-based reinforcement learning pipeline built on pixel-level domain adaptation to bridge the visual sim-to-real gap for deformable object manipulation. The policy is trained in simulation on domain-translated images of a tissue retraction task, chosen for its clinical importance. It can then be deployed without retraining on the real robotic system using raw RGB images, where it performs tissue retraction with a 50% success rate. The main contributions of the paper are:
1. Successfully transferring an image-based reinforcement learning policy from simulation to a real surgical robotic system, bridging the visual gap between simulation and reality; this is an important step for deformable object manipulation in surgical robotics.
2. Introducing a contrastive-learning generative adversarial network (GAN) for pixel-level domain adaptation, which requires less data and needs no task-specific auxiliary objectives to stabilize training, so it can be applied to tasks beyond tissue retraction.
In the study, the visual gap between simulated and real images is addressed through contrastive learning, enabling the learned policy to be deployed on a real surgical robotic system. The approach is promising for complex surgical tasks involving deformable tissue, because it learns task-relevant features directly from sensor data and thereby overcomes limitations of state-based methods.
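The summary does not spell out the contrastive objective. As a rough illustration only, contrastive image-translation methods commonly use an InfoNCE-style loss over patch features: the feature of a patch in the translated image should match the feature of the input patch at the same location (the positive) more closely than features from other locations (the negatives). The sketch below is a minimal NumPy version under that assumption. The function name `patch_info_nce`, the temperature value, and the feature shapes are illustrative choices and are not taken from the paper.

```python
import numpy as np


def patch_info_nce(query, keys, temperature=0.07):
    """InfoNCE loss in which query[i] should match keys[i].

    query, keys: float arrays of shape (num_patches, dim), assumed non-zero.
    Returns the mean cross-entropy over all patches.
    NOTE: hypothetical helper for illustration, not the authors' code.
    """
    # L2-normalize so that dot products become cosine similarities.
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)

    # Similarity of every query patch to every key patch, shape (N, N).
    logits = (q @ k.T) / temperature

    # Numerically stable log-softmax over the keys for each query.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive for query i is key i, i.e. the diagonal entries.
    return float(-np.mean(np.diag(log_prob)))


# Toy usage with random features standing in for encoder outputs.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))

aligned = patch_info_nce(feats, feats)                        # positives match
shuffled = patch_info_nce(feats, np.roll(feats, 1, axis=0))   # positives misaligned
```

In this toy setup the loss is lower when each patch is paired with its own feature than when the pairing is shifted by one position, which is the behavior a contrastive objective is meant to reward. In a translation GAN such a term would be combined with an adversarial loss; how the paper weights or samples these terms is not stated in the summary.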