Lifted-Rollout for Approximate Policy Iteration of Markov Decision Process

Wang-Zhou Dai, Yang Yu, Zhi-Hua Zhou
DOI: https://doi.org/10.1109/ICDMW.2011.112
2011-01-01
Abstract: Sampling-based approximate policy iteration, which samples (or "rolls out") the current policy and finds improvements from the samples, is an efficient and practical approach to solving for policies in Markov decision processes. Such an approach, however, suffers from the inherent variance of sampling. In this paper, we propose the lifted-rollout approach. This approach models the decision process using a directed acyclic graph and then lifts the possibly huge graph by compressing similar nodes. Finally, the approximate policy is obtained by inference on the lifted graph. Experiments show that our approach avoids the sampling variance and achieves significantly better performance.