Abstract: This paper studies risk minimization problems in Markov decision processes with a countable state space and a countable reward set. The objective is to find a policy that minimizes the probability (risk) that the total discounted reward does not exceed a specified value (target). In this kind of model, the decision maker's choice depends not only on the system's state but also on the target value. By introducing the decision maker's state, we formulate a framework for minimizing risk models. The policies discussed depend on target values, and the rewards may be arbitrary real numbers. For the finite horizon model, the main results are: (i) the optimal value functions are distribution functions of the target, (ii) there exists an optimal deterministic Markov policy, and (iii) a policy is optimal if and only if it takes an optimal action at each realizable state. In addition, we obtain a sufficient condition and a necessary condition for the existence of a finite horizon optimal policy independent of targets, and we give an algorithm for computing finite horizon optimal policies and optimal value functions. For the infinite horizon model, we establish the optimality equation and obtain a structural property of optimal policies. We prove that the optimal value function is a distribution function of the target, and we present a new approximation formula that generalizes the nonnegative rewards case. An example exposing errors in the previous literature shows that the existence of an optimal policy has not actually been proved. We give a necessary and sufficient condition for the existence of an infinite horizon optimal policy independent of targets, and we point out that whether an optimal policy exists in the general case remains an open problem.
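To make the target-dependent objective concrete, the following is a minimal sketch of a backward recursion on the augmented state (system state, remaining target). It is not the paper's algorithm, which the abstract does not spell out. It assumes the usual target-rescaling recursion F_n(s, x) = min_a Σ p · F_{n-1}(s', (x − r)/β) with F_0(s, x) = 1 if x ≥ 0 and 0 otherwise. The toy model `P`, the discount `BETA`, and the function names `risk`, `q_value`, and `best_action` are all hypothetical.

```python
from functools import lru_cache

# Hypothetical toy model (not from the paper):
# P[state][action] = list of (probability, next_state, reward).
P = {
    0: {
        "safe": [(1.0, 0, 1.0)],
        "risky": [(0.5, 0, 0.0), (0.5, 0, 3.0)],
    },
}
BETA = 0.5  # assumed discount factor


def q_value(n, s, a, x):
    """Risk of taking action a first, then acting optimally for n - 1 more steps."""
    # r + BETA * future <= x  is equivalent to  future <= (x - r) / BETA
    return sum(p * risk(n - 1, s2, (x - r) / BETA) for p, s2, r in P[s][a])


@lru_cache(maxsize=None)
def risk(n, s, x):
    """Minimal probability that the n-step discounted reward from s is <= x."""
    if n == 0:
        # With no steps left the accumulated reward is 0.
        return 1.0 if x >= 0 else 0.0
    return min(q_value(n, s, a, x) for a in P[s])


def best_action(n, s, x):
    """An action attaining the minimum in risk(n, s, x); requires n >= 1."""
    return min(P[s], key=lambda a: q_value(n, s, a, x))


if __name__ == "__main__":
    for target in (0.5, 2.0):
        print(target, risk(1, 0, target), best_action(1, 0, target))
```

In this toy model the one-step optimal action changes with the target ("safe" for target 0.5, "risky" for target 2.0), which illustrates why policies here are allowed to depend on target values.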

Strategy Complexity of Limsup and Liminf Threshold Objectives in Countable MDPs, with Applications to Optimal Expected Payoffs

The Power of Counting Steps in Quantitative Games

Strategy Complexity of Parity Objectives in Countable MDPs

Strategy Complexity of Reachability in Countable Stochastic 2-Player Games

Strategy Complexity of Büchi Objectives in Concurrent Stochastic Games

Parity Objectives in Countable MDPs

Expectation in Stochastic Games with Prefix-independent Objectives

Finite-memory Strategies for Almost-sure Energy-MeanPayoff Objectives in MDPs

Multidimensional beyond worst-case and almost-sure problems for mean-payoff objectives

Unbounded Cost Markov Decision Processes with Limsup and Liminf Average Criteria: New Conditions

Bounded-Memory Strategies in Partial-Information Games

Positivity-hardness results on Markov decision processes

Markov Decision Processes with Sure Parity and Multiple Reachability Objectives

Finitely additive behavioral strategies: when do they induce an unambiguous expected payoff?

Universal Complexity Bounds Based on Value Iteration for Stochastic Mean Payoff Games and Entropy Games

Finite-horizon Optimality for Continuous-Time Markov Decision Processes with Unbounded Transition Rates

Discounted Continuous-Time Markov Decision Processes with Constraints: Unbounded Transition and Loss Rates

Minimizing Risk Models in Markov Decision Processes with Policies Depending on Target Values

Regret Bounds for Risk-sensitive Reinforcement Learning with Lipschitz Dynamic Risk Measures

Reachability and Safety Objectives in Markov Decision Processes on Long but Finite Horizons