Abstract: Reinforcement Learning (RL) has demonstrated remarkable success across various domains. Nonetheless, ensuring safety remains a significant challenge, particularly when RL is deployed in safety-critical applications such as robotics and autonomous driving. In this work, we develop a robust and safe RL methodology grounded in manifold space. First, we construct a constrained manifold space that takes safety constraints into consideration. We then propose a robust safe RL approach, supported by theoretical analysis and based on the value at risk and conditional value at risk, to enhance the robustness of safety. Our methodology is designed to ensure safety in environments with stochastic constraints. Following the theoretical analysis, we develop a practical algorithm that searches for a robust safe policy on stochastic constraint manifolds (ROSCOM). We evaluate the effectiveness of our approach on circular motion and air-hockey tasks. Our experiments demonstrate that ROSCOM outperforms existing baselines in terms of both reward and safety.

Note to Practitioners: Real-world applications often involve inherent uncertainties, noise, and high-dimensional spaces. This complexity makes ensuring safety in robot learning both more urgent and more difficult, especially when RL is implemented in practical environments. To address this issue, we build a stochastic constraint manifold to delineate the safety space, thus establishing a rigorous framework for robot learning at each iteration. Compared with state-of-the-art baselines, our method performs strongly in both safety and reward. For example, in an air-hockey robot learning task, our method improves safety performance by 50% over the ATACOM framework while also achieving higher reward. Moreover, compared with traditional algorithms such as CPO and PCPO, our method achieves a 99% improvement in safety performance together with significantly higher reward. These empirical results indicate that our approach is not only theoretically sound but also effective in practice, and that it has potential as a useful tool in real robot learning and beyond.
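The abstract names value at risk (VaR) and conditional value at risk (CVaR) as the risk measures behind the robust safety formulation, without giving their form. As a rough illustration only, the sketch below computes the standard empirical estimates of both measures from sampled constraint values. It is not the ROSCOM algorithm: the function names, the confidence level `alpha`, the sample distribution, and the convention that larger constraint values are worse are all assumptions made for this example.

```python
import numpy as np


def value_at_risk(samples, alpha):
    """Empirical VaR: the alpha-quantile of the sampled constraint values."""
    return float(np.quantile(np.asarray(samples, dtype=float), alpha))


def conditional_value_at_risk(samples, alpha):
    """Empirical CVaR: mean of the samples at or above the VaR threshold."""
    samples = np.asarray(samples, dtype=float)
    var = value_at_risk(samples, alpha)
    tail = samples[samples >= var]  # worst-case tail of the distribution
    return float(tail.mean())


# Hypothetical noisy constraint values (assumed: larger means less safe).
rng = np.random.default_rng(0)
constraint_samples = rng.normal(loc=-0.5, scale=0.3, size=10_000)

alpha = 0.95  # assumed confidence level
print("VaR :", value_at_risk(constraint_samples, alpha))
print("CVaR:", conditional_value_at_risk(constraint_samples, alpha))
```

Because CVaR averages over the tail beyond the VaR threshold, it is never smaller than VaR at the same level, so bounding CVaR is the more conservative of the two safety requirements.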

A Stable Actor-Critic Algorithm for Solving Robotic Tasks with Multiple Constraints

Multi-Robot Coordination In Complex Environment With Task And Communication Constraints

ROSCOM: Robust Safe Reinforcement Learning on Stochastic Constraint Manifolds

A Single-Loop Deep Actor-Critic Algorithm for Constrained Reinforcement Learning with Provable Convergence

Model-Based Actor-Critic Learning for Optimal Tracking Control of Robots with Input Saturation

Task-Oriented Deep Reinforcement Learning for Robotic Skill Acquisition and Control

Solving Stabilize-Avoid Optimal Control via Epigraph Form and Deep Reinforcement Learning

An advantage actor-critic algorithm for robotic motion planning in dense and dynamic scenarios

Actor-Critic Reinforcement Learning for Control With Stability Guarantee

Trust the PRoC3S: Solving Long-Horizon Robotics Problems with LLMs and Constraint Satisfaction

A Novel Hierarchical Soft Actor-Critic Algorithm for Multi-Logistics Robots Task Allocation

RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control

Research On Actor-Critic Reinforcement Learning In Robocup

Solver-Critic: A Reinforcement Learning Method for Discrete-Time-Constrained-Input Systems

Model-based Actor-critic Learning of Robotic Impedance Control in Complex Interactive Environment

Applying Online Expert Supervision in Deep Actor-Critic Reinforcement Learning

Multi-Objective Combinatorial Optimization Algorithm Based on Asynchronous Advantage Actor–Critic and Graph Transformer Networks

Multi-Robot Real-time Game Strategy Learning Based on Deep Reinforcement Learning

State-wise Constrained Policy Optimization

Action Constrained Deep Reinforcement Learning Based Safe Automatic Driving Method

A Task-Adaptive Deep Reinforcement Learning Framework for Dual-Arm Robot Manipulation