Average Optimality for Finite Models

Xianping Guo, Onésimo Hernández-Lerma
DOI: https://doi.org/10.1007/978-3-642-02547-1_3
2009-01-01
Abstract: Chapter 3 deals with finite models, that is, continuous-time MDPs with a finite number of states and actions. The long-run expected average reward (AR) criterion and the n-bias (n=0,1,…) optimality criteria are introduced in Sect. 3.2. (Occasionally, we abbreviate expected average reward as EAR rather than expected AR.) For every n=0,1,…, formulas expressing the difference between the n-biases for any two policies are provided in Sect. 3.3. These formulas are used in Sect. 3.4 to characterize n-bias optimal policies. The policy iteration and the linear programming algorithms for computing optimal policies for each of the n-bias criteria are given in Sects. 3.5 and 3.6, respectively.
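For readers unfamiliar with the average reward criterion, the following is a minimal sketch of policy iteration for a finite continuous-time MDP. It is not the chapter's algorithm: it assumes a unichain model (every stationary policy has a single recurrent class) and optimizes only the long-run average reward, not the higher n-bias criteria. The names `solve`, `evaluate`, `policy_iteration`, `rates`, and `rewards`, as well as the two-state example data, are hypothetical and chosen for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system (model may not be unichain)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def evaluate(policy, rates, rewards):
    """Return (gain g, bias-like vector h) for a stationary policy.

    Solves the Poisson equation r_f + Q_f h = g * 1 with the
    normalization h[0] = 0 (an assumption that fixes h uniquely
    in the unichain case).
    """
    n = len(policy)
    A = []
    for i in range(n):
        q = rates[i][policy[i]]
        # Unknowns are (g, h[1], ..., h[n-1]); row i reads
        # g - sum_{j>=1} q(j|i) h[j] = r(i).
        A.append([1.0] + [-q[j] for j in range(1, n)])
    b = [rewards[i][policy[i]] for i in range(n)]
    x = solve(A, b)
    return x[0], [0.0] + x[1:]


def policy_iteration(rates, rewards):
    """Average-reward policy iteration (unichain sketch)."""
    n = len(rates)
    policy = [0] * n
    while True:
        gain, h = evaluate(policy, rates, rewards)
        improved = False
        for i in range(n):
            def score(a):
                return rewards[i][a] + sum(rates[i][a][j] * h[j] for j in range(n))
            best = max(range(len(rates[i])), key=score)
            # Switch only on strict improvement to avoid cycling.
            if score(best) > score(policy[i]) + 1e-9:
                policy[i] = best
                improved = True
        if not improved:
            return policy, gain


# Hypothetical two-state, two-action model.
# rates[i][a][j] is the transition rate q(j | i, a); each row sums to zero.
rates = [
    [[-1.0, 1.0], [-3.0, 3.0]],
    [[2.0, -2.0], [1.0, -1.0]],
]
# rewards[i][a] is the reward rate r(i, a).
rewards = [
    [1.0, 0.0],
    [5.0, 4.5],
]

policy, gain = policy_iteration(rates, rewards)
print(policy, gain)
```

In this toy model the gain of a stationary policy can be checked by hand: with rate a out of state 0 and rate b out of state 1, the stationary distribution is (b/(a+b), a/(a+b)), so the average reward is (b·r0 + a·r1)/(a+b).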