A Course in Dynamic Optimization

Bar Light
2024-10-10
Abstract:These lecture notes are derived from a graduate-level course in dynamic optimization, offering an introduction to techniques and models extensively used in management science, economics, operations research, engineering, and computer science. The course emphasizes the theoretical underpinnings of discrete-time dynamic programming models and advanced algorithmic strategies for solving these models. Unlike typical treatments, it provides a proof for the principle of optimality for upper semi-continuous dynamic programming, a middle ground between the simpler countable state space case \cite{bertsekas2012dynamic}, and the involved universally measurable case \cite{bertsekas1996stochastic}. This approach is sufficiently rigorous to include important examples such as dynamic pricing, consumption-savings, and inventory management models. The course also delves into the properties of value and policy functions, leveraging classical results \cite{topkis1998supermodularity} and recent developments. Additionally, it offers an introduction to reinforcement learning, including a formal proof of the convergence of Q-learning algorithms. Furthermore, the notes delve into policy gradient methods for the average reward case, presenting a convergence result for the tabular case in this context. This result is simple and similar to the discounted case but appears to be new.
Optimization and Control,Theoretical Economics,Systems and Control
What problem does this paper attempt to address?