Non-stationary MDP Average Model --- the Existence of Persistently Optimal (G,B)-Generated Policies

GUO Xian-ping
DOI: https://doi.org/10.3321/j.issn:0583-1431.2000.02.012
2000-01-01
Acta Mathematica Sinica English Series
Abstract: In this paper, we consider the non-stationary MDP average model with countable state space and arbitrary action space. Using the (f,B)-generated policies of Feinberg E. A. for reference, we put forward the (G,B)-generated policies, which generalize both Markov policies and the (f,B)-generated policies of Feinberg E. A. By probabilistic and analytic methods, we prove the existence of persistently optimal (G,B)-generated policies under weaker ergodic conditions.