Dynamic Programming: From Local Optimality to Global Optimality

John Stachurski, Jingni Yang, Ziyue Yang
2025-01-21
Abstract: In the theory of dynamic programming, an optimal policy is a policy whose lifetime value dominates that of all other policies from every possible initial condition in the state space. This raises a natural question: when does optimality from a single state imply optimality from every state? We show that, in a general setting, irreducibility of the transition kernel is sufficient for this property. Our results have important implications for modern policy-based algorithms used to solve large-scale dynamic programs in reinforcement learning and other fields.
Optimization and Control
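The question posed in the abstract can be illustrated numerically on a toy problem. The sketch below is not taken from the paper: it assumes a finite-state, finite-action discounted Markov decision process, and the names `policy_value`, `single_state_optimality_is_global`, and the two hand-built examples are hypothetical. It enumerates every stationary deterministic policy, computes each policy's lifetime value, and checks whether every policy that is optimal from a chosen state `s0` is also optimal from all states. In the first example every policy induces an irreducible transition matrix; in the second, each state is absorbing, so the chain is reducible.

```python
import itertools
import numpy as np


def policy_value(P, r, sigma, beta):
    """Lifetime value of a stationary policy sigma.

    P[s, a, s2] is the transition probability, r[s, a] the reward,
    and beta in (0, 1) the discount factor (all assumed inputs).
    Solves v = r_sigma + beta * P_sigma v.
    """
    n = r.shape[0]
    idx = np.arange(n)
    P_sigma = P[idx, sigma]  # shape (n, n): row s uses action sigma[s]
    r_sigma = r[idx, sigma]  # shape (n,)
    return np.linalg.solve(np.eye(n) - beta * P_sigma, r_sigma)


def single_state_optimality_is_global(P, r, beta, s0, tol=1e-10):
    """Return True if every policy optimal at s0 is optimal at all states."""
    n, m = r.shape
    values = [
        policy_value(P, r, np.array(sigma), beta)
        for sigma in itertools.product(range(m), repeat=n)
    ]
    # In a finite MDP the optimal value function is attained by some
    # stationary deterministic policy, so a pointwise max recovers it.
    v_star = np.max(np.array(values), axis=0)
    for v in values:
        optimal_at_s0 = abs(v[s0] - v_star[s0]) <= tol
        optimal_everywhere = np.max(v_star - v) <= tol
        if optimal_at_s0 and not optimal_everywhere:
            return False
    return True


beta = 0.9

# Example 1: every action moves to each state with probability 1/2,
# so every policy's transition matrix is irreducible.
P_irr = np.full((2, 2, 2), 0.5)
r_irr = np.array([[1.0, 0.0],
                  [0.0, 0.5]])

# Example 2: each state is absorbing under both actions (reducible).
# The action chosen in state 1 cannot affect the value from state 0.
P_red = np.zeros((2, 2, 2))
P_red[0, :, 0] = 1.0
P_red[1, :, 1] = 1.0
r_red = np.array([[1.0, 1.0],
                  [0.0, 0.5]])

print(single_state_optimality_is_global(P_irr, r_irr, beta, s0=0))  # True
print(single_state_optimality_is_global(P_red, r_red, beta, s0=0))  # False
```

In the reducible example, a policy that takes the low-reward action in state 1 still attains the optimal value from state 0, which is exactly the gap between local and global optimality that the abstract describes. Brute-force enumeration is only feasible for very small problems and is used here purely for illustration.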