On controlled Markov chains with optimality requirement and safety constraint

S. Hsu, A. Arapostathis, Ratnesh Kumar, 許舜斌
2010-06-01
Abstract: We study the control of completely observed Markov chains subject to generalized safety bounds and an optimality requirement. Originally, the safety bounds were specified as unit-interval-valued vector pairs (lower and upper bounds for each component of the state probability distribution). In this paper, we generalize the constraint to any linear convex set in which the distribution must remain, and present a way to compute a stationary control policy that is both safe and long-run average optimal. This policy guarantees the safety of the system in its 'limiting status', and is derived through a linear programming formulation whose feasibility problem is also explored. To ensure the safety of the system's transient behavior under the policy, which is assumed to induce a unique limiting distribution in the interior of the constraint set, we present a finitely terminating iterative algorithm to compute the maximal invariant safe set (MISS): the set of initial distributions from which every future distribution is also safe. A theoretical upper bound on the number of iterations is provided. Furthermore, a simplified algorithm that might require less computation is introduced and illustrated in numerical examples. In particular, we obtain a closed-form representation for the MISS of a two-state system based on at most one iteration of the algorithm.
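To make the idea of an invariant safe set concrete, the following minimal Python sketch treats the two-state case under a fixed stationary policy. It is an illustration under stated assumptions, not the paper's algorithm: the function names, the parameters `p00` and `p10` (the probabilities of moving to state 0 from states 0 and 1 under the policy), and the choice to describe the safe set as an interval on the first component of the distribution (assumed to lie within [0, 1]) are all hypothetical. The sketch repeatedly intersects the current interval with its one-step preimage until nothing changes.

```python
def step(x, p00, p10):
    """Next-step probability of state 0, given current probability x.

    With distribution (x, 1 - x) and a row-stochastic transition matrix,
    the update is x' = x * p00 + (1 - x) * p10.
    """
    return p10 + (p00 - p10) * x


def preimage(interval, p00, p10):
    """Set of x whose one-step image lies in `interval` (None if empty)."""
    lo, hi = interval
    a, b = p00 - p10, p10          # the update is the affine map x -> b + a * x
    if a == 0.0:
        # Constant map: every x is mapped to b.
        return (float("-inf"), float("inf")) if lo <= b <= hi else None
    ends = sorted(((lo - b) / a, (hi - b) / a))
    return (ends[0], ends[1])


def invariant_safe_interval(safe, p00, p10, max_iter=1000, tol=1e-12):
    """Shrink `safe` until it is mapped into itself by `step`.

    Returns (interval, n), where n counts the refinements that actually
    changed the interval; returns (None, n) if the set becomes empty.
    """
    current = safe
    for n in range(max_iter):
        pre = preimage(current, p00, p10)
        if pre is None:
            return None, n
        nxt = (max(current[0], pre[0]), min(current[1], pre[1]))
        if nxt[0] > nxt[1]:
            return None, n
        if abs(nxt[0] - current[0]) <= tol and abs(nxt[1] - current[1]) <= tol:
            return current, n      # fixed point reached: current is invariant
        current = nxt
    raise RuntimeError("no fixed point reached within max_iter refinements")


if __name__ == "__main__":
    # Hypothetical numbers: the limiting probability of state 0 is 0.5,
    # which lies in the interior of the safe interval [0.4, 0.9].
    interval, n = invariant_safe_interval((0.4, 0.9), p00=0.05, p10=0.95)
    print(interval, n)
```

For these illustrative numbers the loop changes the interval once and then stops, which is in line with the abstract's remark that the two-state case needs at most one iteration. When `p00 >= p10` and the limiting probability lies inside the safe interval, the interval is already invariant and no refinement is needed.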