Abstract: In this paper, we present a novel framework to synthesize robust strategies for discrete-time nonlinear systems with random disturbances that are unknown, against temporal logic specifications. The proposed framework is data-driven and abstraction-based: leveraging observations of the system, our approach learns a high-confidence abstraction of the system in the form of an uncertain Markov decision process (UMDP). The uncertainty in the resulting UMDP is used to formally account for both the error in abstracting the system and for the uncertainty coming from the data. Critically, we show that for any given state-action pair in the resulting UMDP, the uncertainty in the transition probabilities can be represented as a convex polytope obtained by a two-layer state discretization and concentration inequalities. This allows us to obtain tighter uncertainty estimates compared to existing approaches, and guarantees efficiency, as we tailor a synthesis algorithm exploiting the structure of this UMDP. We empirically validate our approach on several case studies, showing substantially improved performance compared to the state-of-the-art.

What problem does this paper attempt to address?

This paper addresses how to synthesize robust control strategies for nonlinear stochastic systems subject to random disturbances with an unknown distribution, so that the closed-loop system satisfies a temporal logic specification. Specifically:

1. **Problem Background**:
   - In fields such as robotics, self-driving cars, and cyber-physical systems, ensuring the safe operation of stochastic systems is crucial.
   - When the system dynamics contain random disturbances with an unknown distribution, it is difficult to account for this uncertainty while guaranteeing performance, especially under complex high-level specifications.

2. **Limitations of Existing Methods**:
   - Existing methods usually assume that the disturbance distribution is known, or rely on overly conservative abstraction methods. Both limit their scalability and applicability to complex systems.

3. **Objectives of the Paper**:
   - Propose a new framework for synthesizing optimal strategies for nonlinear stochastic systems under unknown random disturbances, with formal guarantees and computational efficiency.
   - Construct an uncertain Markov decision process (UMDP) abstraction through a data-driven approach, thereby reducing conservatism and improving the accuracy of the results.

4. **Specific Contributions**:
   - Propose a novel framework for synthesizing strategies that satisfy linear temporal logic over finite traces (LTLf) specifications for nonlinear stochastic systems under unknown, non-additive disturbances.
   - Propose a distribution-free, data-driven method to construct UMDP abstractions with a specific structure, reducing the conservatism of existing abstraction techniques.
   - Design an efficient strategy-synthesis algorithm that exploits the structure of the UMDP without introducing additional conservatism.
   - Through a series of case studies and benchmark tests, demonstrate the advantages of this framework over existing methods, with a three-order-of-magnitude improvement in sample complexity and a one-order-of-magnitude reduction in computation time.

5. **Main Challenges**:
   - How to generate, from a limited number of data samples and without knowing the disturbance distribution, a control strategy that satisfies a given LTLf formula with high probability.
   - The synthesized strategy must account for the learning gap caused by the lack of knowledge of the disturbance distribution P.

6. **Solutions**:
   - Use a two-layer discretization scheme and concentration inequalities to represent the transition probability uncertainty in the UMDP as a convex polytope, thereby obtaining a tighter uncertainty estimate.
   - Maximize the probability of satisfying the LTLf formula on the UMDP through a robust dynamic programming algorithm, and refine the resulting strategy to the original system, so that the probability that the closed-loop system satisfies the formula φ is no lower than the probability computed on the abstract model.

In summary, this paper aims to solve the control problem of nonlinear stochastic systems under unknown disturbances through UMDP abstractions learned from data and a strategy-synthesis algorithm tailored to their structure.
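To make the combination of concentration inequalities and robust dynamic programming more concrete, the following is a minimal Python sketch under simplifying assumptions. It is not the paper's algorithm: the uncertainty set is reduced to plain per-successor probability intervals rather than the convex polytope from the two-layer discretization, the concentration inequality is assumed to be Hoeffding's inequality, and the specification is a bounded-horizon reachability objective (an LTLf formula can in general be reduced to reachability on a product with a finite automaton). All function names and the data layout are hypothetical.

```python
import math


def hoeffding_interval(count, n, delta):
    """Two-sided Hoeffding interval for one transition probability.

    count: samples that landed in the target region, n: total samples,
    delta: allowed failure probability for this single interval.
    """
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    p_hat = count / n
    return max(0.0, p_hat - eps), min(1.0, p_hat + eps)


def worst_case_expectation(lower, upper, values):
    """Minimise sum_j p[j] * values[j] over lower <= p <= upper, sum(p) = 1.

    Assumes sum(lower) <= 1 <= sum(upper), so the interval set is non-empty.
    """
    p = list(lower)
    budget = 1.0 - sum(lower)
    # The adversary pushes the free probability mass to low-value successors.
    for j in sorted(range(len(values)), key=lambda k: values[k]):
        add = min(upper[j] - lower[j], budget)
        p[j] += add
        budget -= add
    return sum(pj * vj for pj, vj in zip(p, values))


def robust_reachability(intervals, goal, horizon):
    """Robust value iteration on an interval-valued abstraction.

    intervals[s][a] = (lower, upper): bounds over successor states.
    Returns a lower bound on the probability of reaching `goal` within
    `horizon` steps from each state, and the maximising first-step action.
    """
    n = len(intervals)
    v = [1.0 if s in goal else 0.0 for s in range(n)]
    first_action = [None] * n
    for _ in range(horizon):
        new_v = list(v)
        for s in range(n):
            if s in goal:
                continue  # goal states are treated as absorbing
            best_val, best_act = -1.0, None
            for a, (lo, up) in intervals[s].items():
                val = worst_case_expectation(lo, up, v)
                if val > best_val:
                    best_val, best_act = val, a
            new_v[s], first_action[s] = best_val, best_act
        v = new_v
    return v, first_action


# Hypothetical 3-state abstraction: 0 = start, 1 = goal, 2 = unsafe sink.
example = [
    {"a": ([0.0, 0.6, 0.1], [0.3, 0.8, 0.4]),
     "b": ([0.0, 0.4, 0.0], [0.6, 0.9, 0.2])},
    {"stay": ([0.0, 1.0, 0.0], [0.0, 1.0, 0.0])},
    {"stay": ([0.0, 0.0, 1.0], [0.0, 0.0, 1.0])},
]
values, actions = robust_reachability(example, goal={1}, horizon=2)
```

In this sketch the inner minimization is solved greedily, which is exact for interval sets. A general polytope would instead call for a linear program or, as the summary indicates, a procedure tailored to the polytope's structure.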

Temporal Logic Control for Nonlinear Stochastic Systems Under Unknown Disturbances

Data-Driven Strategy Synthesis for Stochastic Systems with Unknown Nonlinear Disturbances

Robust Dynamic Programming for Temporal Logic Control of Stochastic Systems

Efficient Strategy Synthesis for Switched Stochastic Systems with Distributional Uncertainty

Piecewise output feedback control for affine systems with disturbances based on linear temporal logic specifications

Strategy synthesis for partially-known switched stochastic systems

Specification-guided temporal logic control for stochastic systems: a multi-layered approach

Temporal Robustness of Temporal Logic Specifications: Analysis and Control Design

On synthesizing robust discrete controllers under modeling uncertainty

Control of Probabilistic Systems under Dynamic, Partially Known Environments with Temporal Logic Specifications

Robust Control for Dynamical Systems with Non-Gaussian Noise via Formal Abstractions

Distributionally Robust Control for Chance-Constrained Signal Temporal Logic Specifications

Scalable control synthesis for stochastic systems via structural IMDP abstractions

Learning Optimal Strategies for Temporal Tasks in Stochastic Games

Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees

Data-Driven Distributionally Robust System Level Synthesis

Correct-by-Construction Control for Stochastic and Uncertain Dynamical Models via Formal Abstractions

Probabilities Are Not Enough: Formal Controller Synthesis for Stochastic Dynamical Models with Epistemic Uncertainty

Signal Temporal Logic Control Synthesis among Uncontrollable Dynamic Agents with Conformal Prediction

Probabilistic Tube-based Control Synthesis of Stochastic Multi-Agent Systems under Signal Temporal Logic

Reinforcement Learning with Temporal Logic Constraints for Partially-Observable Markov Decision Processes