Feasibility Consistent Representation Learning for Safe Reinforcement Learning

Zhepeng Cen, Yihang Yao, Zuxin Liu, Ding Zhao

2024-06-13

Abstract: In the field of safe reinforcement learning (RL), finding a balance between satisfying safety constraints and optimizing reward performance presents a significant challenge. A key obstacle in this endeavor is the estimation of safety constraints, which is typically more difficult than estimating a reward metric due to the sparse nature of the constraint signals. To address this issue, we introduce a novel framework named Feasibility Consistent Safe Reinforcement Learning (FCSRL). This framework combines representation learning with feasibility-oriented objectives to identify and extract safety-related information from the raw state for safe RL. Leveraging self-supervised learning techniques and a more learnable safety metric, our approach enhances policy learning and constraint estimation. Empirical evaluations across a range of vector-state and image-based tasks demonstrate that our method is capable of learning a better safety-aware embedding and achieving better performance than previous representation learning baselines.

Machine Learning

What problem does this paper attempt to address?

This paper addresses a core challenge in safe reinforcement learning (RL): how to optimize reward performance while satisfying safety constraints. Estimating safety constraints is harder than estimating rewards because constraint signals are sparse, and this sparsity leads to inaccurate constraint estimates. To address this, the authors propose a framework called Feasibility Consistent Safe Reinforcement Learning (FCSRL). It combines representation learning with feasibility-oriented objectives to identify and extract safety-related information from raw states, using self-supervised learning techniques and a more learnable safety metric to improve both policy learning and constraint estimation. The paper also introduces a learning objective called the feasibility score, which is smoother than other cost metrics. The feasibility score serves as an auxiliary task for representation learning, sharpening the safety-related features in the embedding and helping the agent find a better balance between reward maximization and constraint satisfaction. Experiments on vector-state and image-based tasks show that FCSRL learns better safety-aware embeddings and outperforms previous representation learning baselines, especially under stricter constraints.
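
To illustrate why a smoother safety metric can be easier to learn than a sparse cost signal, here is a minimal sketch. The summary above does not define the feasibility score, so the code assumes one plausible form purely for illustration: the maximum discounted future cost along a trajectory. The function name `discounted_max_future_cost`, the discount `gamma`, and the toy cost sequence are all hypothetical and are not taken from the paper.

```python
def discounted_max_future_cost(costs, gamma=0.9):
    """For each step t, return max over k >= 0 of gamma**k * costs[t + k].

    Assumed (hypothetical) stand-in for a feasibility-style target.
    Costs are assumed to be non-negative.
    """
    scores = [0.0] * len(costs)
    running = 0.0
    # Sweep backwards using score[t] = max(costs[t], gamma * score[t + 1]).
    for t in range(len(costs) - 1, -1, -1):
        running = max(costs[t], gamma * running)
        scores[t] = running
    return scores


# Sparse cost signal: a single violation at the final step.
costs = [0.0, 0.0, 0.0, 0.0, 1.0]
scores = discounted_max_future_cost(costs, gamma=0.5)
print(scores)  # [0.0625, 0.125, 0.25, 0.5, 1.0]
```

In this toy trajectory the raw cost is zero at every step but the last, while the assumed score is non-zero at every step and grows as the violation approaches, so a regression target built from it carries a learning signal at every state.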

Feasible Actor-Critic: Constrained Reinforcement Learning for Ensuring Statewise Safety

Iterative Reachability Estimation for Safe Reinforcement Learning

Reachability Constrained Reinforcement Learning

Feasible Policy Iteration

State-Wise Safe Reinforcement Learning With Pixel Observations

Safe Reinforcement Learning via Hierarchical Adaptive Chance-Constraint Safeguards

Safe Model-Based Reinforcement Learning with an Uncertainty-Aware Reachability Certificate

Constrained reinforcement learning with statewise projection: a control barrier function approach

The Feasibility of Constrained Reinforcement Learning Algorithms: A Tutorial Study

A Survey of Constraint Formulations in Safe Reinforcement Learning

Context-Aware Safe Reinforcement Learning for Non-Stationary Environments

Safe Reinforcement Learning with Free-form Natural Language Constraints and Pre-Trained Language Models

Synthesizing Control Barrier Functions with Feasible Region Iteration for Safe Reinforcement Learning

Evaluating Model-free Reinforcement Learning Toward Safety-critical Tasks

Safe Offline Reinforcement Learning with Feasibility-Guided Diffusion Model

Safe reinforcement learning for probabilistic reachability and safety specifications: A Lyapunov-based approach

Safety-Aware Causal Representation for Trustworthy Offline Reinforcement Learning in Autonomous Driving

State-wise Safe Reinforcement Learning: A Survey

Safe Reinforcement Learning Using Robust Control Barrier Functions

Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments