Average Optimality for Unbounded Rewards

Xianping Guo, Onésimo Hernández-Lerma
DOI: https://doi.org/10.1007/978-3-642-02547-1_7
2009-01-01
Abstract: In Chap. 7, we study the expected average reward (EAR) criterion for the same Markov decision process (MDP) model as in Chap. 6. After briefly introducing some basic facts in Sect. 7.2, we establish the average reward optimality equation and the existence of EAR optimal policies in Sect. 7.3. In Sect. 7.4, we provide a policy iteration algorithm for computing, or at least approximating, an EAR optimal policy. Finally, we illustrate the results of this chapter with several examples in Sect. 7.5.