Bounded Synthesis and Reinforcement Learning of Supervisors for Stochastic Discrete Event Systems With LTL Specifications

Ryohei Oura, Toshimitsu Ushio, Ami Sakakibara
DOI: https://doi.org/10.1109/tac.2024.3376723
IF: 6.549
2024-01-01
IEEE Transactions on Automatic Control
Abstract: We consider supervisory control of stochastic discrete event systems (SDESs) under linear temporal logic (LTL) specifications. Applying bounded synthesis, we reduce supervisor synthesis to the problem of satisfying a safety condition. First, we consider a directed controller, which enables at most one controllable event. We assign a negative reward to unsafe states and introduce an expected return with a state-dependent discount factor. Using this expected return as the value function, we compute a winning region and a directed controller with the maximum satisfaction probability by dynamic programming. Next, we construct a permissive supervisor from the optimal value function and show that it achieves the maximum satisfaction probability and maximizes the reachable set within the winning region. Finally, for an unknown SDES, we propose a two-stage model-free reinforcement learning method that efficiently learns the winning region and the directed controllers with the maximum satisfaction probability. We demonstrate the effectiveness of the proposed method by simulation.
Automation & Control Systems; Engineering, Electrical & Electronic
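
The dynamic programming step mentioned in the abstract (a negative reward on unsafe states, a state-dependent discount factor, and the expected return used as a value function) can be illustrated with a minimal sketch. This is not the paper's algorithm: the toy system, the state and event names, the reward of -1, and the choice of discount 0 at unsafe states and 1 elsewhere are all assumptions made for illustration, and each event key below stands for one choice of a directed controller with an assumed successor distribution. Under these assumptions the optimal value of a state equals minus the smallest achievable probability of ever reaching an unsafe state, so states with value 0 form the winning region of the toy example.

```python
# Toy illustration only: every state, event, and probability is hypothetical.
# P[state][event] is a list of (next_state, probability) pairs.
P = {
    "s0": {"a": [("s1", 0.9), ("bad", 0.1)], "b": [("s2", 1.0)]},
    "s1": {"a": [("s1", 1.0)]},
    "s2": {"a": [("s0", 0.5), ("s2", 0.5)], "b": [("bad", 1.0)]},
    "s3": {"a": [("s3", 0.7), ("bad", 0.3)]},
    "bad": {},  # unsafe and absorbing
}
UNSAFE = {"bad"}


def reward(state):
    """Assumed reward: -1 on entering an unsafe state, 0 otherwise."""
    return -1.0 if state in UNSAFE else 0.0


def discount(state):
    """Assumed state-dependent discount: 0 at unsafe states, 1 elsewhere."""
    return 0.0 if state in UNSAFE else 1.0


def backup(V, outcomes):
    """Expected one-step return of an event under the value estimate V."""
    return sum(p * (reward(t) + discount(t) * V[t]) for t, p in outcomes)


def value_iteration(P, tol=1e-12, max_iter=10_000):
    """In-place value iteration starting from V = 0."""
    V = {s: 0.0 for s in P}
    for _ in range(max_iter):
        delta = 0.0
        for s, events in P.items():
            if not events:  # absorbing state, nothing to choose
                continue
            best = max(backup(V, outs) for outs in events.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    return V


V = value_iteration(P)

# States from which the unsafe state can be avoided with probability 1.
winning = {s for s in P if s not in UNSAFE and V[s] > -1e-9}

# Greedy directed controller: one enabled event per state.
policy = {
    s: max(events, key=lambda e: backup(V, events[e]))
    for s, events in P.items()
    if events
}

print(sorted(winning))
print(policy)
```

In this toy system the states s0, s1, and s2 can avoid the unsafe state forever (s0 must choose event b and s2 must choose event a), while s3 reaches it with probability 1 and therefore converges to value -1.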