Abstract: We consider continuous-time Markov decision processes in Polish spaces. The performance of a control policy is measured by the expected discounted reward criterion associated with state-dependent discount factors. All underlying Markov processes are determined by the given transition rates, which are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. By using the dynamic programming approach, we establish the discounted reward optimality equation (DROE) and the existence and uniqueness of its solutions. Under suitable conditions, we also obtain a discounted optimal stationary policy which is optimal in the class of all randomized stationary policies. Moreover, when the transition rates are uniformly bounded, we provide an algorithm to compute (or at least to approximate) the discounted reward optimal value function as well as a discounted optimal stationary policy. Finally, we use an example to illustrate our results. Specifically, we first derive an explicit and exact solution to the DROE and an explicit expression of a discounted optimal stationary policy for this example.
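To make the bounded-rate case concrete, the following is a minimal sketch of value iteration via uniformization on a finite state and action space. It is not the algorithm of the paper, whose details the abstract does not give; it is the standard textbook construction, under the assumptions that the DROE takes the form alpha(x) u(x) = max_a [ r(x,a) + sum_y q(y|x,a) u(y) ] and that alpha(x) is bounded below by a positive constant. The function name `solve_droe`, the nested-list layout `q[x][a][y]`, and the two-state data are all hypothetical and chosen only for illustration.

```python
def solve_droe(q, r, alpha, tol=1e-10, max_iter=100_000):
    """Approximate u with alpha[x]*u[x] = max_a (r[x][a] + sum_y q[x][a][y]*u[y]).

    q[x][a][y]: transition rate from x to y under action a (rows sum to 0).
    r[x][a]:    reward rate; alpha[x] > 0: state-dependent discount factor.
    Assumes finitely many states/actions and uniformly bounded rates.
    """
    n = len(alpha)
    # Uniformization constant: an upper bound on all exit rates -q[x][a][x].
    lam = max(-q[x][a][x] for x in range(n) for a in range(len(q[x])))
    lam = max(lam, 1e-12)  # guard against the degenerate all-zero case
    u = [0.0] * n
    policy = [0] * n
    for _ in range(max_iter):
        new_u = []
        for x in range(n):
            best_val, best_a = None, None
            for a in range(len(q[x])):
                # Expectation under p(y|x,a) = delta_xy + q(y|x,a)/lam.
                expect = u[x] + sum(q[x][a][y] * u[y] for y in range(n)) / lam
                # Rearranged DROE: a contraction with factor lam/(alpha+lam) < 1.
                val = (r[x][a] + lam * expect) / (alpha[x] + lam)
                if best_val is None or val > best_val:
                    best_val, best_a = val, a
            new_u.append(best_val)
            policy[x] = best_a
        diff = max(abs(new_u[x] - u[x]) for x in range(n))
        u = new_u
        if diff < tol:
            break
    return u, policy


# Hypothetical two-state, two-action model (illustrative numbers only).
q = [
    [[-1.0, 1.0], [-3.0, 3.0]],
    [[2.0, -2.0], [0.5, -0.5]],
]
r = [[1.0, 0.5], [0.0, 2.0]]
alpha = [0.1, 0.3]

u, policy = solve_droe(q, r, alpha)
print("approximate value function:", u)
print("greedy stationary policy:", policy)
```

The returned policy is the greedy action at each state with respect to the last iterate, so it is only approximately optimal when the iteration stops at a finite tolerance.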

Markov Decision Processes with State-Dependent Discount Factors and Unbounded Rewards/Costs

Continuous Time Markov Decision Processes with Expected Discounted Total Rewards

Continuous-Time Markov Decision Processes with State-Dependent Discount Factors

Denumerable-state Continuous-Time Markov Decision Processes with Unbounded Transition and Reward Rates under the Discounted Criterion

Markov Decision Problems with Unbounded Transition Rates under Discounted-Cost Performance Criteria

Continuous-Time Markov Decision Processes with Unbounded Transition and Discounted-Reward Rates

Risk-Sensitive Discounted Continuous-Time Markov Decision Processes with Unbounded Rates

Continuous Time Markov Decision Processes with Nonuniformly Bounded Transition Rate: Expected Total Rewards

Risk-sensitive discounted Markov decision processes with unbounded reward functions and Borel spaces

Risk-sensitive Infinite-Horizon Discounted Piecewise Deterministic Markov Decision Processes

Continuous-Time Markov Decision Processes with Discounted Rewards: the Case of Polish Spaces

Semi-Markov Decision Processes with Variance Minimization Criterion

The Finiteness of the Reward Function and the Optimal Value Function in Markov Decision Processes

The Risk Probability Criterion for Discounted Continuous-Time Markov Decision Processes

First Passage Optimality for Continuous-Time Markov Decision Processes with Varying Discount Factors and History-Dependent Policies

Markov Decision Processes with Time-Varying Geometric Discounting

Discounted Optimality for Continuous-Time Markov Decision Processes in Polish Spaces

Continuous Time Markov Decision Processes with Discounted Moment Criterion

Finite-horizon Optimality for Continuous-Time Markov Decision Processes with Unbounded Transition Rates

Risk-sensitive Continuous-Time Markov Decision Processes with Unbounded Rates and Borel Spaces

Convergence of Markov Decision Processes with Constraints and State-Action Dependent Discount Factors