Average Cost Optimality of Partially Observed MDPs: Contraction of Nonlinear Filters and Existence of Optimal Solutions and Approximations
Yunus Emre Demirci, Ali Devran Kara, Serdar Yüksel
DOI: https://doi.org/10.1137/24m1643736
IF: 2.2
2024-11-06
SIAM Journal on Control and Optimization
Abstract: SIAM Journal on Control and Optimization, Volume 62, Issue 6, Pages 2859-2883, December 2024. Average cost optimality is known to be a challenging problem for partially observable stochastic control, with few results available beyond the finite state, action, and measurement setup, for which somewhat restrictive conditions are available. In this paper, we present explicit and easily testable conditions for the existence of solutions to the average cost optimality equation where the state space is compact. In particular, we present a novel contraction-based analysis, which, to the best of our knowledge, is new to the literature, building on recent regularity results for nonlinear filters. Beyond establishing existence, we also present several implications of our analysis that are also new to the literature: (i) robustness to incorrect priors, (ii) near optimality of policies based on quantized approximations, (iii) near optimality of policies with finite memory, and (iv) convergence in Q-learning. In addition to our main theorem, each of these represents a novel contribution for average cost criteria.
mathematics, applied; automation & control systems