Constrained Semi-Markov Decision Processes with Ratio and Time Expected Average Criteria in Polish Spaces

Qingda Wei, Xianping Guo
DOI: https://doi.org/10.1080/02331934.2013.860686
IF: 2.2
2013-01-01
Optimization
Abstract: This paper deals with the ratio and time expected average criteria for constrained semi-Markov decision processes (SMDPs). The state and action spaces are Polish spaces, the rewards and costs may be unbounded from above and from below, and the mean holding times are allowed to be unbounded from above. First, under general conditions, we prove the existence of constrained-optimal policies for the ratio expected average criterion by developing a technique of occupation measures that incorporate the mean holding times for SMDPs; these measures generalize those for standard discrete-time and continuous-time MDPs. Then, we give suitable conditions under which we establish the equivalence of the two average criteria via the optional sampling theorem, and thus show the existence of constrained-optimal policies for the time expected average criterion. Finally, we illustrate the application of our main results with a controlled linear system, for which an exact optimal policy is obtained.
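For orientation, the two criteria named in the abstract are commonly written as below. This is a sketch in generic textbook notation, not necessarily the paper's own: the symbols $r$ (reward), $\tau$ (mean holding time), $T_n$ (the $n$-th decision epoch), $N(T)$, the policy $\pi$ and the initial distribution $\gamma$ are all assumptions introduced here for illustration.

```latex
% Assumed notation: (x_k, a_k) is the state-action pair at the k-th decision epoch,
% r(x,a) the reward, \tau(x,a) the mean holding time, T_n the n-th decision epoch.

% Ratio expected average criterion: expected reward per unit of expected time.
\[
  V_{\mathrm{ratio}}(\pi,\gamma)
  = \liminf_{n\to\infty}
    \frac{\mathbb{E}_{\gamma}^{\pi}\!\left[\sum_{k=0}^{n} r(x_k,a_k)\right]}
         {\mathbb{E}_{\gamma}^{\pi}\!\left[\sum_{k=0}^{n} \tau(x_k,a_k)\right]}
\]

% Time expected average criterion: expected reward accumulated up to time T, per unit time.
\[
  V_{\mathrm{time}}(\pi,\gamma)
  = \liminf_{T\to\infty}
    \frac{1}{T}\,
    \mathbb{E}_{\gamma}^{\pi}\!\left[\sum_{k=0}^{N(T)} r(x_k,a_k)\right],
  \qquad
  N(T) = \sup\{\, n \ge 0 : T_n \le T \,\}
\]
```

In a constrained problem of this kind, one such criterion is typically optimized while analogous averages built from cost functions are required to stay within prescribed bounds. The random index $N(T)$ in the second formula suggests why an optional sampling argument is a natural tool for relating the two criteria.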