Dynamic Programming: Optimality at a Point Implies Optimality Everywhere

John Stachurski, Jingni Yang, Ziyue Yang
2024-11-17
Abstract: In the theory of dynamic programming, an optimal policy is a policy whose lifetime value dominates that of all other policies at every point in the state space. This raises a natural question: under what conditions does optimality at a single state imply optimality at every state? We show that, in a general setting, the irreducibility of the transition kernel under a feasible policy is a sufficient condition for extending optimality from one state to all states. These results have important implications for dynamic optimization algorithms based on gradient methods, which are routinely applied in reinforcement learning and other large-scale applications.
Optimization and Control
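
The claim in the abstract can be illustrated on a small finite-state example. The sketch below is not taken from the paper: it assumes a discounted Markov decision process with finitely many states and actions, and the names `policy_value`, `is_irreducible`, and `classify_policies` are hypothetical helpers introduced only for illustration. It enumerates every deterministic stationary policy, computes its lifetime value by solving a linear system, and records at which states that value attains the optimum.

```python
import itertools
import numpy as np


def policy_value(r, P, sigma, beta):
    """Return (v_sigma, P_sigma), where v_sigma solves v = r_sigma + beta * P_sigma @ v."""
    n = r.shape[0]
    idx = np.arange(n)
    r_s = r[idx, sigma]       # r_sigma[x] = r[x, sigma[x]]
    P_s = P[idx, sigma, :]    # P_sigma[x, y] = P[x, sigma[x], y]
    v = np.linalg.solve(np.eye(n) - beta * P_s, r_s)
    return v, P_s


def is_irreducible(P_s):
    """Finite-state test: every state reaches every other iff (I + A)^(n-1) > 0."""
    n = P_s.shape[0]
    A = (P_s > 0).astype(float) + np.eye(n)
    return bool(np.all(np.linalg.matrix_power(A, n - 1) > 0))


def classify_policies(r, P, beta, tol=1e-8):
    """For each deterministic policy: (sigma, irreducible?, mask of states where it is optimal)."""
    n, m = r.shape
    policies = [np.array(s) for s in itertools.product(range(m), repeat=n)]
    solved = [policy_value(r, P, s, beta) for s in policies]
    # In a finite discounted MDP the optimal value is the pointwise maximum
    # over deterministic stationary policies.
    v_star = np.max([v for v, _ in solved], axis=0)
    return [
        (tuple(s.tolist()), is_irreducible(P_s), np.abs(v - v_star) < tol)
        for s, (v, P_s) in zip(policies, solved)
    ]


# Illustrative two-state example (an assumption, not from the paper).
beta = 0.9
r = np.array([[1.0, 1.0], [0.0, 0.0]])   # reward 1 in state 0, reward 0 in state 1
P = np.zeros((2, 2, 2))
P[0, :, 0] = 1.0   # state 0 is absorbing under both actions
P[1, 0, 1] = 1.0   # state 1, action 0: stay in state 1
P[1, 1, 0] = 1.0   # state 1, action 1: move to state 0

for sigma, irreducible, optimal_at in classify_policies(r, P, beta):
    print(sigma, irreducible, optimal_at)
```

In this example, the policy that stays in state 1 forever is optimal at state 0 but not at state 1, and its transition kernel is not irreducible, so optimality at a point fails to spread. The policy that moves from state 1 to state 0 is optimal everywhere even though its kernel is also reducible, which is consistent with irreducibility being a sufficient rather than a necessary condition.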