Event-based optimization for finite-horizon total-cost Markov decision processes

Yanjia Zhao, Qianchuan Zhao, Xiaohong Guan
2010-01-01
Abstract: An event-based optimization approach for finite-horizon total-cost Markov decision processes (MDPs) is developed in this paper. By responding to certain events instead of all state transitions, the potentials (i.e., value functions) can be aggregated and used to derive performance sensitivity formulas, which lead to gradient-based policy optimization. A sample-path-based estimation algorithm for the aggregated potentials is also developed. This event-based approach provides a new perspective on solving finite-horizon MDPs with large problem scales or correlated actions. A practical application in manufacturing systems demonstrates the effectiveness and efficiency of the new approach.
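To make the idea of sample-path estimation of aggregated potentials more concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes a fixed policy (so the system is a Markov chain), a per-state cost, and a hypothetical function `event_of` that maps each transition to an event index. The aggregated potential of an event at time `t` is taken here to be the average remaining cost, counted from the state reached by the transition, over all sampled transitions at time `t` that belong to that event. All names, and that convention, are assumptions made for illustration.

```python
import random


def simulate_path(P, s0, horizon, rng):
    """Sample one trajectory with `horizon` transitions from the chain P."""
    path = [s0]
    for _ in range(horizon):
        s = path[-1]
        path.append(rng.choices(range(len(P[s])), weights=P[s])[0])
    return path


def estimate_aggregated_potentials(paths, cost, event_of, n_events):
    """Average the remaining cost over all transitions sharing (time, event).

    Returns est[t][e], or None where event e was never observed at time t.
    Assumed convention: the remaining cost after the transition at time t
    is the sum of cost[path[k]] for k = t+1, ..., horizon.
    """
    horizon = len(paths[0]) - 1
    sums = [[0.0] * n_events for _ in range(horizon)]
    counts = [[0] * n_events for _ in range(horizon)]
    for path in paths:
        # cost_to_go[k] = total cost from step k to the end of the horizon
        cost_to_go = [0.0] * (horizon + 2)
        for k in range(horizon, -1, -1):
            cost_to_go[k] = cost[path[k]] + cost_to_go[k + 1]
        for t in range(horizon):
            e = event_of(path[t], path[t + 1])
            sums[t][e] += cost_to_go[t + 1]
            counts[t][e] += 1
    return [
        [sums[t][e] / counts[t][e] if counts[t][e] else None
         for e in range(n_events)]
        for t in range(horizon)
    ]


if __name__ == "__main__":
    # Hypothetical two-state chain; event 0 = "state unchanged", 1 = "state changed".
    rng = random.Random(0)
    P = [[0.7, 0.3], [0.4, 0.6]]
    cost = [1.0, 2.0]
    paths = [simulate_path(P, 0, 5, rng) for _ in range(1000)]
    est = estimate_aggregated_potentials(
        paths, cost, lambda s, s_next: 0 if s == s_next else 1, n_events=2
    )
    for t, row in enumerate(est):
        print(t, row)
```

Because the estimate is indexed by (time, event) rather than by (time, state), the number of quantities to estimate depends on the number of events, which is what makes aggregation attractive when the state space is large.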