Q-learning as a monotone scheme

Lingyi Yang
2024-05-31
Abstract: Stability issues with reinforcement learning methods persist. To better understand some of the stability and convergence issues that arise in deep reinforcement learning, we examine a simple linear-quadratic example. We interpret the convergence criterion of exact Q-learning in the sense of a monotone scheme and discuss the consequences of function approximation for monotonicity properties.
Machine Learning