Safe Navigation for Robotic Digestive Endoscopy via Human Intervention-based Reinforcement Learning

Min Tan, Yushun Tao, Boyun Zheng, GaoSheng Xie, Lijuan Feng, Zeyang Xia, Jing Xiong

2024-09-24

Abstract: With the increasing application of automated robotic digestive endoscopy (RDE), ensuring safe and efficient navigation in the unstructured and narrow digestive tract has become a critical challenge. Existing automated reinforcement learning navigation algorithms often result in potentially risky collisions due to the absence of essential human intervention, which significantly limits the safety and effectiveness of RDE in actual clinical practice. To address this limitation, we propose a Human Intervention (HI)-based Proximal Policy Optimization (PPO) framework, dubbed HI-PPO, which incorporates expert knowledge to enhance RDE's safety. Specifically, we introduce an Enhanced Exploration Mechanism (EEM) to address the low exploration efficiency of the standard PPO. Additionally, a reward-penalty adjustment (RPA) is implemented to penalize unsafe actions during initial interventions. Furthermore, Behavior Cloning Similarity (BCS) is included as an auxiliary objective to ensure the agent emulates expert actions. Comparative experiments conducted in a simulated platform across various anatomical colon segments demonstrate that our model effectively and safely guides RDE.

Robotics, Artificial Intelligence

What problem does this paper attempt to address?

The paper aims to address the issue of safe and efficient navigation of Robotic Digestive Endoscopes (RDE) in unstructured and narrow digestive tracts. Currently, automated navigation algorithms based on reinforcement learning may lead to potential collision risks in actual clinical applications due to the lack of necessary human intervention, severely affecting the safety and effectiveness of RDE. To overcome this limitation, the authors propose a Proximal Policy Optimization (PPO) framework based on human intervention, called HI-PPO. This framework enhances the safety of RDE by incorporating expert knowledge and specifically proposes the following mechanisms:

1. **Enhanced Exploration Mechanism (EEM)**: Improves the exploration efficiency of standard PPO.
2. **Reward-Penalty Adjustment (RPA)**: Promotes safer policy learning by penalizing unsafe behaviors during initial interventions.
3. **Behavior Cloning Similarity (BCS)**: Ensures that the agent can mimic expert actions, improving learning performance in complex environments.

Experimental results show that the HI-PPO method can effectively guide RDE in various anatomical colon segments, significantly reducing the number of collisions and improving safety. This indicates that the method not only enhances accuracy but also provides a safer navigation solution in complex surgical environments.
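As a rough illustration of how two of these mechanisms could attach to a standard PPO update, the sketch below combines the usual PPO clipped surrogate with a reward penalty and a behavior-cloning term. It is not the authors' implementation: the function names, the fixed penalty value, the use of cosine similarity for BCS, and the weight `bc_weight` are all assumptions made for illustration. EEM is omitted because the summary does not describe how it works.

```python
import math


def ppo_clipped_objective(ratio, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate for one sample (a quantity to maximize)."""
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def adjusted_reward(env_reward, intervention_started, penalty=1.0):
    """Hypothetical RPA: subtract a fixed penalty on the step where the
    expert first takes over, since that step signals an unsafe action."""
    return env_reward - penalty if intervention_started else env_reward


def cosine_similarity(agent_action, expert_action):
    """Assumed similarity measure between agent and expert action vectors."""
    dot = sum(a * e for a, e in zip(agent_action, expert_action))
    norm_agent = math.sqrt(sum(a * a for a in agent_action))
    norm_expert = math.sqrt(sum(e * e for e in expert_action))
    denom = norm_agent * norm_expert
    return dot / denom if denom > 0 else 0.0


def hi_ppo_style_loss(ratio, advantage, agent_action, expert_action,
                      intervened, bc_weight=0.5):
    """Per-sample loss to minimize: negated PPO surrogate plus, on steps
    where the expert intervened, a penalty for dissimilar actions."""
    loss = -ppo_clipped_objective(ratio, advantage)
    if intervened:
        similarity = cosine_similarity(agent_action, expert_action)
        loss += bc_weight * (1.0 - similarity)
    return loss
```

In this sketch the behavior-cloning term is applied only to steps where an expert action exists, so steps without intervention reduce to plain PPO. Whether the paper weights or gates the term this way is not stated in the summary.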
