Average Optimality in Markov Decision Processes with Unbounded Rewards

胡奇英
DOI: https://doi.org/10.3969/j.issn.1007-6093.2002.01.001
2002-01-01
Abstract: This paper studies average optimality in Markov decision processes with a countable state space, nonempty action sets, and an unbounded reward function. New conditions are discussed under which there exists an (ε-)optimal stationary policy and the average criterion optimality inequality holds whenever the summation in it is well defined.
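For orientation, the average criterion optimality inequality is usually written in the following form for reward-maximizing models. This is a sketch in standard textbook notation, not necessarily the paper's own: the symbols S (state space), A(i) (action set at state i), r(i,a) (reward), p_{ij}(a) (transition probability), ρ* (a constant) and h (a function on S) are all assumed here. The summation over j is the one that must be well defined when r, and hence h, is unbounded.

```latex
% Assumed notation (not taken from the paper):
%   S        countable state space
%   A(i)     nonempty action set at state i
%   r(i,a)   possibly unbounded reward
%   p_{ij}(a) transition probability from i to j under action a
%   \rho^*   a constant, h a real-valued function on S
\[
  \rho^{*} + h(i) \;\le\; \sup_{a \in A(i)}
  \Bigl\{ r(i,a) + \sum_{j \in S} p_{ij}(a)\, h(j) \Bigr\},
  \qquad i \in S .
\]
```

Under suitable conditions, a stationary policy that attains the supremum on the right-hand side at every state (or comes within ε of it) is the kind of (ε-)optimal stationary policy the abstract refers to.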