Semi-Markov Decision Processes with Variance Minimization Criterion

Qingda Wei, Xianping Guo
DOI: https://doi.org/10.1007/s10288-014-0267-2
2014-01-01
4OR
Abstract: We consider a variance minimization problem for semi-Markov decision processes with state-dependent discount factors in Borel spaces. The reward function may be unbounded from below and from above. Under suitable conditions, we first prove that the discount variance minimization criterion can be transformed into an equivalent expected discount criterion, and then show the existence of a discount variance minimal policy over the class of expected discount optimal stationary policies. Furthermore, we also give a value iteration algorithm for calculating the expected discount optimal value function. Finally, two examples are used to illustrate our results.
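The value iteration step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes finite state and action sets (the paper works in Borel spaces), exponentially distributed sojourn times, and a constant reward rate during each sojourn. The function name value_iteration and all parameter names are hypothetical. Under these assumptions, a sojourn in state x with rate lam and discount factor alpha(x) has expected discount lam / (lam + alpha(x)) and expected discounted reward r / (lam + alpha(x)).

```python
def value_iteration(rate_reward, trans, sojourn_rate, alpha,
                    tol=1e-10, max_iter=10_000):
    """Approximate the expected discount optimal value function.

    Illustrative assumptions (not taken from the paper):
      rate_reward[x][a]  : reward rate earned while sojourning in x under a
      trans[x][a][y]     : probability of jumping from x to y under a
      sojourn_rate[x][a] : rate of the exponential sojourn time
      alpha[x]           : state-dependent discount factor, alpha[x] > 0
    """
    n_states = len(alpha)
    v = [0.0] * n_states
    for _ in range(max_iter):
        new_v = []
        for x in range(n_states):
            best = float("-inf")
            for a in range(len(trans[x])):
                lam = sojourn_rate[x][a]
                # E[exp(-alpha(x) * T)] for T ~ Exponential(lam)
                beta = lam / (lam + alpha[x])
                # Expected discounted reward accumulated during the sojourn
                one_step = rate_reward[x][a] / (lam + alpha[x])
                # Expected value of the next state
                cont = sum(p * v[y] for y, p in enumerate(trans[x][a]))
                best = max(best, one_step + beta * cont)
            new_v.append(best)
        gap = max(abs(n - o) for n, o in zip(new_v, v))
        v = new_v
        if gap < tol:  # stop once successive iterates agree
            break
    return v
```

For instance, with one absorbing state, reward rate 1, sojourn rate 1, and discount factor 1, the iterates converge to the closed-form value 1 (the integral of exp(-t) over all t >= 0). The sketch covers only the expected discount criterion; selecting a variance minimal policy among the optimal stationary policies is a separate step that it does not attempt.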