Abstract: Recently, safe reinforcement learning (RL) with the actor-critic structure for continuous control tasks has received increasing attention. It is still challenging to learn a near-optimal control policy with safety and convergence guarantees. Moreover, few works have addressed safe RL algorithm design under time-varying safety constraints. This paper proposes a safe RL algorithm for optimal control of nonlinear systems with time-varying state and control constraints. In the proposed approach, we construct a novel barrier force-based control policy structure to guarantee control safety. A multi-step policy evaluation mechanism is proposed to predict the policy's safety risk under time-varying safety constraints and to guide the policy to update safely. Theoretical results on stability and robustness are proven, and the convergence of the actor-critic implementation is analyzed. The proposed algorithm outperforms several state-of-the-art RL algorithms in the simulated Safety Gym environment. Furthermore, the approach is applied to the integrated path following and collision avoidance problem for two real-world intelligent vehicles. A differential-drive vehicle and an Ackermann-drive vehicle are used to verify offline deployment and online learning performance, respectively. Our approach shows an impressive sim-to-real transfer capability and a satisfactory online control performance in the experiment.

A Time-Aggregated Model-Free RL Algorithm for Optimal Containment Control of MASs

Safety-critical Containment Control for Multi-Agent Systems with Communication Delays

Distributed Fault-Tolerant Containment Control Protocols for the Discrete-Time Multiagent Systems via Reinforcement Learning Method

Optimal Control for Constrained Discrete-Time Nonlinear Systems Based on Safe Reinforcement Learning

Event-triggered optimal containment control for multi-agent systems subject to state constraints via reinforcement learning

Optimized Backstepping-Based Containment Control for Multiagent Systems With Deferred Constraints Using a Universal Nonlinear Transformation

Reinforcement Learning-Based Event-Triggered Constrained Containment Control for Perturbed Multiagent Systems

Event-Triggered Containment Control for Nonlinear Multiagent Systems Via Reinforcement Learning

Model-Based Safe Reinforcement Learning With Time-Varying Constraints: Applications to Intelligent Vehicles

Data-Driven Fault-Tolerant Reinforcement Learning Containment Control for Nonlinear Multiagent Systems

Internal reinforcement adaptive dynamic programming for optimal containment control of unknown continuous-time multi-agent systems

Model-Based Safe Reinforcement Learning with Time-Varying State and Control Constraints: An Application to Intelligent Vehicles

Adaptive resilient containment control using reinforcement learning for nonlinear stochastic multi-agent systems under sensor faults

Finite-time adaptive optimal consensus control for multi-agent systems subject to time-varying output constraints

Containment control for second-order multi-agent systems with time-varying delays via variable-augmented-based free-weighting matrices

Simplified optimized finite-time containment control for a class of multi-agent systems with actuator faults

Optimal Tracking Control of Nonlinear Multiagent Systems Using Internal Reinforce Q-Learning

Event-based adaptive fixed-time optimal control for saturated fault-tolerant nonlinear multiagent systems via reinforcement learning algorithm

Data-Efficient Off-Policy Learning for Distributed Optimal Tracking Control of HMAS with Unidentified Exosystem Dynamics

Practical Optimal Formation-Containment Tracking Control of Nonlinear Multiagent Systems With Unknown Dynamics

Designing Observer-Type Controller for Containment of Discrete-Time Linear MASs over Signed Graph