Abstract: Formal verification provides a high degree of confidence in safe system operation, but only if reality matches the verified model. Although a good model will be accurate most of the time, even the best models are incomplete. This is especially true in cyber-physical systems, because high-fidelity physical models of such systems are expensive to develop and often intractable to verify. Conversely, reinforcement learning-based controllers are lauded for their flexibility in unmodeled environments, but do not provide guarantees of safe operation. This paper presents an approach for provably safe learning that provides the best of both worlds: the exploration and optimization capabilities of learning along with the safety guarantees of formal verification. Our main insight is that formal verification combined with verified runtime monitoring can ensure the safety of a learning agent. Verification results are preserved whenever learning agents limit exploration to the confines of verified control choices, as long as observed reality comports with the model used for off-line verification. When a model violation is detected, the agent abandons efficiency and instead attempts to learn a control strategy that guides it back to a modeled portion of the state space. We prove that our approach to incorporating knowledge about safe control into learning systems preserves safety guarantees, and demonstrate that we retain the empirical performance benefits provided by reinforcement learning. We also explore various points in the design space for these justified speculative controllers in a simple model of adaptive cruise control for autonomous cars.
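The monitoring scheme described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the toy car-following dynamics in `model_step`, the three-element `ACTIONS` set, the names `is_safe_action`, `model_holds`, `recovery_reward`, and `choose_action`, and the use of a dictionary of Q-values with epsilon-greedy selection. The hand-written checks stand in for monitors that, in the approach described, would be derived from a formally verified model.

```python
import random

# Hypothetical toy setup: a follower car approaches a stopped leader.
ACTIONS = (-1.0, 0.0, 1.0)  # brake, coast, accelerate (assumed values)
B = 1.0                     # assumed maximum braking magnitude
DT = 1.0                    # assumed control period


def model_step(state, accel):
    """Assumed model of the dynamics; state is (gap, speed)."""
    gap, speed = state
    new_speed = max(0.0, speed + accel * DT)
    return (gap - new_speed * DT, new_speed)


def is_safe_action(state, accel):
    """Stand-in controller monitor: after the action, the car must
    still be able to brake to a stop within the remaining gap."""
    gap, speed = model_step(state, accel)
    return gap >= 0 and speed ** 2 / (2 * B) <= gap


def model_holds(prev_state, accel, observed, tol=1e-6):
    """Stand-in model monitor: does the observed transition match
    what the model predicted for this action?"""
    predicted = model_step(prev_state, accel)
    return all(abs(p - o) <= tol for p, o in zip(predicted, observed))


def recovery_reward(prev_state, accel, observed):
    """Assumed recovery objective: penalize distance between the
    observation and the model's prediction, so the agent is pushed
    back toward the modeled portion of the state space."""
    predicted = model_step(prev_state, accel)
    return -sum(abs(p - o) for p, o in zip(predicted, observed))


def choose_action(state, q_values, model_ok, epsilon=0.1, rng=random):
    """Epsilon-greedy choice. While the model holds, exploration is
    limited to actions the controller monitor accepts; otherwise all
    actions are available and learning targets recovery_reward."""
    if model_ok:
        allowed = [a for a in ACTIONS if is_safe_action(state, a)]
        allowed = allowed or [min(ACTIONS)]  # fall back to hardest braking
    else:
        allowed = list(ACTIONS)
    if rng.random() < epsilon:
        return rng.choice(allowed)
    return max(allowed, key=lambda a: q_values.get((state, a), 0.0))
```

In this sketch the learner's preferences only matter inside the set of actions the controller monitor accepts, which mirrors the abstract's claim that verification results are preserved while exploration stays within verified control choices. The fallback to hardest braking when no action passes the check is a design choice of the sketch, not something the abstract specifies.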

Formal Control Synthesis Via Safe Reinforcement Learning under Real-Time Specifications

Joint Synthesis of Safety Certificate and Safe Control Policy Using Constrained Reinforcement Learning

Formal synthesis of controllers for safety-critical autonomous systems: Developments and challenges

Safeguarding Learning-based Control for Smart Energy Systems with Sampling Specifications

Safe Reinforcement Learning via Formal Methods: Toward Safe Control Through Proof and Learning

Learning-Based Synthesis of Safety Controllers

Safe Barrier-Constrained Control of Uncertain Systems via Event-triggered Learning

Implicit Safe Set Algorithm for Provably Safe Reinforcement Learning

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

Shielded Reinforcement Learning for Hybrid Systems

Temporal Logic Guided Safe Reinforcement Learning Using Control Barrier Functions

Tunable Reactive Synthesis for Lipschitz-Bounded Systems with Temporal Logic Specifications

Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees

Scalable Synthesis of Verified Controllers in Deep Reinforcement Learning

Synthesis of Temporally-Robust Policies for Signal Temporal Logic Tasks using Reinforcement Learning

Safe Controller for Output Feedback Linear Systems using Model-Based Reinforcement Learning

Synthesize Efficient Safety Certificates for Learning-Based Safe Control Using Magnitude Regularization

Reinforcement Learning with Temporal Logic Constraints for Partially-Observable Markov Decision Processes

Safe Model-Based Reinforcement Learning for Systems with Parametric Uncertainties

Model-free PAC Time-Optimal Control Synthesis with Reinforcement Learning