Strategy Improvement for Concurrent Reachability and Safety Games
Krishnendu Chatterjee, Luca de Alfaro, Thomas A. Henzinger
DOI: https://doi.org/10.48550/arXiv.1201.2834
2012-01-13
Computer Science and Game Theory
Abstract: We consider concurrent games played on graphs. At every round of a game, each player simultaneously and independently selects a move; the moves jointly determine the transition to a successor state. Two basic objectives are the safety objective, to stay forever in a given set of states, and its dual, the reachability objective, to reach a given set of states. First, we present a simple proof of the fact that in concurrent reachability games, for all $\epsilon>0$, memoryless $\epsilon$-optimal strategies exist. A memoryless strategy is independent of the history of the play, and an $\epsilon$-optimal strategy achieves the objective with probability within $\epsilon$ of the value of the game. In contrast to previous proofs of this fact, our proof is more elementary and more combinatorial. Second, we present a strategy-improvement (a.k.a. policy-iteration) algorithm for concurrent games with reachability objectives. Third, we present a strategy-improvement algorithm for concurrent games with safety objectives. Our algorithms yield sequences of player-1 strategies that ensure probabilities of winning that converge monotonically to the value of the game. Our result is significant because the strategy-improvement algorithm for safety games provides, for the first time, a way to approximate the value of a concurrent safety game from below. Previous methods could approximate the values of these games only from one direction, and as no rates of convergence are known, they did not provide a practical way to solve these games.
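The notion of a memoryless $\epsilon$-optimal strategy can be illustrated on a small example. The sketch below is not taken from the abstract: it assumes a well-known one-state toy game in which player 1 chooses hide or run and player 2 chooses wait or throw. Under the assumed rules, (hide, wait) repeats the round, (hide, throw) and (run, wait) reach the target, and (run, throw) loses for player 1. The function name `worst_case_reach_probability` and the parameter `run_prob` are hypothetical names chosen for illustration.

```python
from fractions import Fraction


def worst_case_reach_probability(run_prob: Fraction) -> Fraction:
    """Worst-case probability that player 1 reaches the target in the toy game.

    Player 1 uses the memoryless strategy "run with probability run_prob,
    hide otherwise". Once that strategy is fixed, player 2 faces a one-state
    Markov decision process, so comparing its two pure memoryless replies
    is enough to find the worst case.
    """
    if not 0 <= run_prob <= 1:
        raise ValueError("run_prob must lie in [0, 1]")
    # Reply "always wait": each round player 1 runs (and wins) with
    # probability run_prob, otherwise the round repeats. The target is
    # reached with probability 1 if run_prob > 0, and never if run_prob == 0.
    against_wait = Fraction(1) if run_prob > 0 else Fraction(0)
    # Reply "always throw": the game ends in the first round, and player 1
    # wins exactly when it hid.
    against_throw = 1 - run_prob
    return min(against_wait, against_throw)


if __name__ == "__main__":
    for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 100), Fraction(0)):
        guaranteed = worst_case_reach_probability(eps)
        print(f"run_prob = {eps}: guaranteed probability = {guaranteed}")
```

In this assumed game, running with probability $\epsilon>0$ guarantees reaching the target with probability $1-\epsilon$, so the value is 1 and each such strategy is memoryless and $\epsilon$-optimal. Setting the run probability to 0 guarantees nothing, which suggests why the abstract speaks of $\epsilon$-optimal rather than optimal strategies.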