Online Adaptive Optimization Algorithm for Semi-Markov Control Processes

Jiang Qi, Xi Hongsheng, Yin Baoqun
DOI: https://doi.org/10.1109/chicc.2006.280525
2006-01-01
Abstract: Semi-Markov control problems with an unknown kernel are considered, and a reinforcement-learning-based online adaptive optimization algorithm is proposed. First, an event-driven stochastic switching model is introduced to formulate the semi-Markov control problems. Then, by exploiting the features of the event-driven policy, an optimization algorithm that combines policy gradient estimation and stochastic approximation is derived. The algorithm can converge to the global optimum without explicit knowledge of the semi-Markov kernel. Moreover, it does not require computing performance potentials or other related quantities (e.g., Q-factors), and therefore significantly reduces computational cost. Simulation results demonstrate the effectiveness of the proposed algorithm.
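To make the combination of policy gradient estimation and stochastic approximation more concrete, here is a minimal sketch. It is not the paper's algorithm: it uses a generic likelihood-ratio gradient estimate over regenerative cycles on a toy two-state, two-action semi-Markov process. The simulator `sample_transition`, its rewards and sojourn times, the logistic policy, and the step-size schedule are all assumptions chosen for illustration. The learner only sees sampled transitions, never the kernel, and it does not compute potentials or Q-factors.

```python
import math
import random


def sample_transition(state, action, rng):
    """Hypothetical simulator that hides the semi-Markov kernel.

    Returns (next_state, sojourn_time, reward). All numbers are made up.
    """
    mean_sojourn = {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 1.0}
    reward = {(0, 0): 0.2, (0, 1): 1.0, (1, 0): 0.0, (1, 1): 1.0}
    sojourn = rng.expovariate(1.0 / mean_sojourn[(state, action)])
    next_state = 0 if rng.random() < 0.5 else 1
    return next_state, sojourn, reward[(state, action)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(num_cycles=20000, seed=0):
    """Online update of a randomized policy from sampled cycles only."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]  # theta[s]: logit of choosing action 1 in state s
    eta = 0.0           # running estimate of the average reward rate
    for n in range(num_cycles):
        # Decreasing step size (assumed schedule): sum diverges,
        # sum of squares converges.
        step = 0.05 / (1.0 + n / 2000.0)
        state = 0                # state 0 serves as the regeneration point
        total_reward = 0.0
        total_time = 0.0
        score = [0.0, 0.0]       # accumulated d/dtheta log pi over the cycle
        while True:
            p = sigmoid(theta[state])
            action = 1 if rng.random() < p else 0
            score[state] += action - p  # score of a logistic policy
            state, sojourn, reward = sample_transition(state, action, rng)
            total_reward += reward
            total_time += sojourn
            if state == 0:       # cycle ends on return to state 0
                break
        # (R - eta * T) * score estimates the gradient of the reward rate
        # up to a positive factor 1 / E[T].
        signal = total_reward - eta * total_time
        for s in range(2):
            theta[s] += step * signal * score[s]
        eta += step * signal     # fixed point satisfies E[R] = eta * E[T]
    return theta, eta


if __name__ == "__main__":
    theta, eta = train()
    print("P(action 1 | state):", [round(sigmoid(t), 3) for t in theta])
    print("estimated reward rate:", round(eta, 3))
```

In this toy setup action 1 is better in both states, so the learned probabilities should drift toward 1. A local gradient method of this kind gives no global-optimality guarantee in general; the global convergence stated in the abstract is the paper's own result for its event-driven formulation.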