Continuous-Time Markov Decision Processes with State-Dependent Discount Factors

Liuer Ye, Xianping Guo
DOI: https://doi.org/10.1007/s10440-012-9669-3
IF: 1.563
2012-01-01
Acta Applicandae Mathematicae
Abstract: We consider continuous-time Markov decision processes in Polish spaces. The performance of a control policy is measured by the expected discounted reward criterion associated with state-dependent discount factors. All underlying Markov processes are determined by the given transition rates, which are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. Using the dynamic programming approach, we establish the discounted reward optimality equation (DROE) and the existence and uniqueness of its solutions. Under suitable conditions, we also obtain a discounted optimal stationary policy which is optimal in the class of all randomized stationary policies. Moreover, when the transition rates are uniformly bounded, we provide an algorithm to compute (or at least to approximate) the discounted reward optimal value function as well as a discounted optimal stationary policy. Finally, we use an example to illustrate our results. In particular, we first derive an explicit and exact solution to the DROE and an explicit expression for a discounted optimal stationary policy in this example.
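To make the bounded-rate case concrete, here is a minimal sketch of one standard way to approximate the value function: value iteration after uniformization. It is not taken from the paper and may differ from the authors' algorithm. It assumes a finite state and action space, a discount factor alpha(x) bounded below by a positive constant, and a DROE of the form alpha(x) V(x) = max_a { r(x,a) + sum_y q(y|x,a) V(y) }. The function name `solve_droe` and the nested-list layout `q[x][a][y]` are hypothetical choices made for illustration.

```python
def solve_droe(q, r, alpha, tol=1e-10, max_iter=100_000):
    """Approximate the solution of alpha(x) V(x) = max_a {r(x,a) + sum_y q(y|x,a) V(y)}.

    q[x][a][y]: transition rates (each row sums to 0), assumed uniformly bounded.
    r[x][a]:    reward rates.
    alpha[x]:   state-dependent discount factors, assumed strictly positive.
    Returns (V, policy), where policy[x] is a maximizing action index.
    """
    n_states = len(alpha)
    n_actions = len(r[0])

    # Uniformization constant: a bound on all exit rates -q(x|x,a).
    lam = max(-q[x][a][x] for x in range(n_states) for a in range(n_actions))
    if lam <= 0:
        raise ValueError("at least one state must have a positive exit rate")

    v = [0.0] * n_states
    policy = [0] * n_states
    for _ in range(max_iter):
        new_v = []
        for x in range(n_states):
            best_val, best_a = None, 0
            for a in range(n_actions):
                # Adding lam * V(x) to both sides of the DROE gives
                # (alpha(x) + lam) V(x) = r + sum_y q(y|x,a) V(y) + lam V(x).
                drift = sum(q[x][a][y] * v[y] for y in range(n_states)) + lam * v[x]
                val = (r[x][a] + drift) / (alpha[x] + lam)
                if best_val is None or val > best_val:
                    best_val, best_a = val, a
            new_v.append(best_val)
            policy[x] = best_a
        gap = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        if gap < tol:  # the map is a contraction with modulus lam / (lam + min alpha)
            break
    return v, policy
```

As a check, take two states and a single action with rates q = [[-1, 1]] in state 0 and [[2, -2]] in state 1, rewards r = [[1], [0]], and discount factors alpha = [0.5, 1.0]. Solving the two linear equations by hand gives V = (1.2, 0.8), and the iteration converges to the same values.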