Abstract:

Introduction: Remote military operations require rapid response times for effective relief and critical care. Yet the military theater is an austere environment: communication links are unreliable and subject to physical and virtual attacks and to degradation at unpredictable times. Immediate medical care at these austere locations requires semi-autonomous teleoperated systems, which allow medical procedures to be completed even when the network is interrupted while isolating medics from the dangers of the battlefield. However, to achieve autonomy for complex surgical and critical care procedures, robots require either extensive programming or massive libraries of surgical skill demonstrations from which machine learning algorithms can learn effective policies. Although such datasets are achievable for simple tasks, providing a large number of demonstrations for surgical maneuvers is not practical. This article presents a method for learning from demonstration that combines knowledge from demonstrations to eliminate reward shaping in reinforcement learning (RL). In addition to reducing the data required for training, the self-supervised nature of RL, in conjunction with expert knowledge-driven rewards, produces more generalizable policies that tolerate dynamic changes in the environment. A multimodal representation of interaction enables learning complex, contact-rich surgical maneuvers. The approach is evaluated on the cricothyroidotomy task, a standard critical care procedure for opening the airway. We also provide a method for segmenting the teleoperator's demonstration into subtasks and classifying the subtasks using sequence modeling.

Materials and methods: A database of demonstrations for the cricothyroidotomy task was collected, comprising six fundamental maneuvers referred to as surgemes. The dataset was collected by teleoperating a collaborative robotic platform, SuperBaxter, fitted with modified surgical grippers. Two learning models were then developed to process the dataset: one for automatically segmenting the task demonstrations into a sequence of surgemes, and a second for classifying each segment into a labeled surgeme. Finally, a multimodal off-policy RL method with rewards learned from demonstrations was developed to learn surgeme execution from these demonstrations.

Results: The task segmentation model has an accuracy of 98.2%. The surgeme classification model using the proposed interaction features achieved a classification accuracy of 96.25% averaged across all surgemes, compared to 87.08% without these features and 85.4% using a support vector machine classifier. Finally, the robot execution achieved a task success rate of 93.5%, compared to baselines of behavioral cloning (78.3%) and a twin-delayed deep deterministic policy gradient with shaped rewards (82.6%).

Conclusions: The results indicate that the proposed interaction features improve accuracy in the segmentation and classification of surgical tasks. The proposed method for learning surgemes from demonstrations outperforms popular methods for skill learning. The effectiveness of the proposed approach demonstrates its potential for future remote telemedicine on battlefields.
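The segmentation step described in the abstract, which turns a demonstration into a sequence of surgemes, can be illustrated with a minimal sketch. It assumes a model has already produced one surgeme label per frame; how the authors' segmentation model actually works is not specified here. The function name `frames_to_segments` and the labels `S1` to `S3` are hypothetical placeholders, not names from the paper.

```python
from itertools import groupby
from typing import List, Tuple


def frames_to_segments(frame_labels: List[str]) -> List[Tuple[str, int, int]]:
    """Collapse per-frame labels into (label, start, end) segments.

    `start` is inclusive and `end` is exclusive, both as frame indices.
    This is an illustrative post-processing step, not the paper's model.
    """
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)  # number of consecutive frames with this label
        segments.append((label, start, start + length))
        start += length
    return segments


# Hypothetical per-frame predictions; "S1".."S3" are placeholder surgeme names.
labels = ["S1", "S1", "S1", "S2", "S2", "S3", "S3", "S3", "S3"]
segments = frames_to_segments(labels)
for name, begin, end in segments:
    print(f"{name}: frames {begin} to {end - 1}")
```

Each resulting segment could then be passed to a separate classifier or used as a training unit for a per-surgeme policy, which mirrors the two-model split the abstract describes.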

Autonomous navigation of catheters and guidewires in mechanical thrombectomy using inverse reinforcement learning

A Zero-Shot Reinforcement Learning Strategy for Autonomous Guidewire Navigation

Image-Guided Autonomous Guidewire Navigation in Robot-Assisted Endovascular Interventions using Reinforcement Learning

Deep reinforcement learning for guidewire navigation in coronary artery phantom

Trajectory Optimization of Robot-Assisted Endovascular Catheterization with Reinforcement Learning

Collaborative Robot-Assisted Endovascular Catheterization with Generative Adversarial Imitation Learning

Learning-Based Autonomous Navigation, Benchmark Environments and Simulation Framework for Endovascular Interventions

VesNet-RL: Simulation-based Reinforcement Learning for Real-World US Probe Navigation

Learning-based Endovascular Navigation Through the Use of Non-Rigid Registration for Collaborative Robotic Catheterization

Hierarchical deep reinforcement learning controlled three-dimensional navigation of microrobots in blood vessels

Integration of Reinforcement Learning in a Virtual Robotic Surgical Simulation

Abstract TMP61: Telerobotic Neurovascular Interventions With Magnetic Manipulation

ASAP-CORPS: A Semi-Autonomous Platform for COntact-Rich Precision Surgery

Reinforcement learning in large, structured action spaces: A simulation study of decision support for spinal cord injury rehabilitation

VascularPilot3D: Toward a 3D fully autonomous navigation for endovascular robotics

Learning automatic navigation control skills for miniature helical robots from human demonstrations

Autonomous Navigation for Robot-Assisted Intraluminal and Endovascular Procedures: A Systematic Review

Hierarchical HMM Based Learning of Navigation Primitives for Cooperative Robotic Endovascular Catheterization

Telerobotic neurovascular interventions with magnetic manipulation