Primal-Dual Regression Approach for Markov Decision Processes with General State and Action Spaces

Denis Belomestny, John Schoenmakers
DOI: https://doi.org/10.1137/22m1526010
IF: 2.2
2024-02-14
SIAM Journal on Control and Optimization
Abstract: SIAM Journal on Control and Optimization, Volume 62, Issue 1, Pages 650-679, February 2024. We develop a regression-based primal-dual martingale approach for solving discrete-time, finite-horizon MDPs. The state and action spaces may be finite or infinite (but regular enough) subsets of Euclidean space. Consequently, our method allows for the construction of tight upper- and lower-biased approximations of the value functions, providing precise estimates of the optimal policy. Importantly, we prove error bounds for the estimated duality gap featuring polynomial dependence on the time horizon. Additionally, we observe sublinear dependence of the stochastic part of the error on the cardinality/dimension of the state and action spaces. From a computational perspective, our proposed method is efficient. Unlike typical duality-based methods for optimal control problems in the literature, the Monte Carlo procedures involved here do not require nested simulations.
mathematics, applied; automation & control systems
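
To make the idea of upper- and lower-biased value approximations concrete, the following is a minimal sketch of a generic martingale duality bound on a toy finite MDP. It is not the authors' regression algorithm. The dynamics `step`, the reward `reward`, the horizon `T`, and the noise law `NOISE` are all hypothetical choices made for illustration, and conditional expectations are computed by exact enumeration rather than by regression or simulation. Given any approximate value functions, the greedy policy yields a lower bound on the optimal value, while subtracting martingale increments built from the same approximation and maximizing along each noise path yields an upper bound.

```python
import itertools

# Toy finite-horizon MDP (all choices below are illustrative assumptions).
T = 3
STATES = [0, 1, 2]
ACTIONS = [0, 1]
NOISE = [(0, 0.5), (1, 0.5)]  # (noise value, probability)

def step(x, a, e):
    """Hypothetical dynamics: next state driven by action and exogenous noise."""
    return (x + a + e) % 3

def reward(x, a):
    """Hypothetical reward: higher states pay more, action 1 costs 0.5."""
    return x - 0.5 * a

def cond_exp(v_next, x, a):
    """Exact conditional expectation E[v_next(step(x, a, noise))]."""
    return sum(p * v_next[step(x, a, e)] for e, p in NOISE)

def optimal_values():
    """Exact value functions V[t][x] by backward induction (terminal value 0)."""
    V = [[0.0] * len(STATES) for _ in range(T + 1)]
    for t in reversed(range(T)):
        for x in STATES:
            V[t][x] = max(reward(x, a) + cond_exp(V[t + 1], x, a) for a in ACTIONS)
    return V

def policy_value(V_approx, x0):
    """Lower bound: exact value of the policy that is greedy w.r.t. V_approx."""
    W = [0.0] * len(STATES)
    for t in reversed(range(T)):
        new_W = []
        for x in STATES:
            a = max(ACTIONS, key=lambda b: reward(x, b) + cond_exp(V_approx[t + 1], x, b))
            new_W.append(reward(x, a) + cond_exp(W, x, a))
        W = new_W
    return W[x0]

def dual_upper(V_approx, x0):
    """Upper bound: expected pathwise maximum of rewards minus martingale increments."""
    total = 0.0
    for path in itertools.product(NOISE, repeat=T):
        prob = 1.0
        for _, p in path:
            prob *= p
        theta = [0.0] * len(STATES)  # pathwise value at the horizon
        for t in reversed(range(T)):
            e = path[t][0]  # noise realized between time t and t + 1
            theta = [
                max(
                    reward(x, a)
                    # martingale increment built from the approximation
                    - (V_approx[t + 1][step(x, a, e)] - cond_exp(V_approx[t + 1], x, a))
                    + theta[step(x, a, e)]
                    for a in ACTIONS
                )
                for x in STATES
            ]
        total += prob * theta[x0]
    return total

V_star = optimal_values()
V_rough = [[0.0] * len(STATES) for _ in range(T + 1)]  # a deliberately crude guess
print(policy_value(V_rough, 0), V_star[0][0], dual_upper(V_rough, 0))
```

In this sketch the duality gap `dual_upper(V, x0) - policy_value(V, x0)` closes when the exact value functions are supplied, and it is nonnegative for any other guess. Because the toy noise law is enumerated exactly, no inner simulation is needed here; how the paper achieves this with Monte Carlo and regression is not reproduced.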