Abstract: This paper concerns quasi-stochastic approximation (QSA) for solving root-finding problems commonly found in applications to optimization and reinforcement learning. The general constant-gain algorithm may be expressed as the time-inhomogeneous ODE $ \frac{d}{dt}\Theta_t=\alpha f_t (\Theta_t)$, with state process $\Theta$ evolving on $\mathbb{R}^d$. The theory is based on an almost periodic vector field, so that in particular the time average of $f_t(\theta)$ defines the time-homogeneous mean vector field $\bar{f} \colon \mathbb{R}^d \to \mathbb{R}^d$ with $\bar{f}(\theta^*)=0$. Under smoothness assumptions on the functions involved, the following exact representation is obtained: \[\frac{d}{dt}\Theta_t=\alpha[\bar{f}(\Theta_t)-\alpha\bar\Upsilon_t+\alpha^2\mathcal{W}_t^0+\alpha\frac{d}{dt}\mathcal{W}_t^1+\frac{d^2}{dt^2}\mathcal{W}_t^2]\] along with formulae for the smooth signals $\{\bar \Upsilon_t , \mathcal{W}_t^i : i=0, 1, 2\}$. This new representation, combined with new conditions for ultimate boundedness, has many uses in advancing the theory of QSA and its applications, including the following implications that are developed in this paper: (i) A proof that the estimation error $\|\Theta_t-\theta^*\|$ is of order $O(\alpha)$, but can be reduced to $O(\alpha^2)$ using a second-order linear filter. (ii) In application to extremum seeking control, it is found that the results do not apply because the standard algorithms are not Lipschitz continuous. A new approach is presented to ensure that the required Lipschitz bounds hold, and from this we obtain stability, transient bounds, asymptotic bias of order $O(\alpha^2)$, and asymptotic variance of order $O(\alpha^4)$. (iii) It is in general possible to obtain better than $O(\alpha)$ bounds on error in traditional stochastic approximation when there is Markovian noise.
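
As a rough numerical illustration of the constant-gain QSA ODE and the $O(\alpha)$ error claim in (i), the sketch below integrates $\frac{d}{dt}\Theta_t=\alpha f_t(\Theta_t)$ with a forward Euler scheme for a scalar toy problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the vector field $f_t(\theta) = -(1+\sin t)(\theta-\theta^*) + \cos 2t$, whose time average is $\bar f(\theta) = -(\theta-\theta^*)$, the choice $\theta^*=1$, the horizon, and the step size. The second-order filter that yields $O(\alpha^2)$ error is not implemented.

```python
import math


def qsa_steady_state_error(alpha, theta_star=1.0, horizon=800.0, dt=0.01):
    """Euler-integrate d/dt Theta = alpha * f_t(Theta).

    Returns the largest |Theta_t - theta_star| observed over the final
    quarter of the run, after the transient has largely died out.
    """

    def f(t, theta):
        # Hypothetical almost periodic vector field (an assumption for this
        # sketch); its time average is -(theta - theta_star).
        return -(1.0 + math.sin(t)) * (theta - theta_star) + math.cos(2.0 * t)

    theta = 0.0  # arbitrary initial condition
    n_steps = int(horizon / dt)
    worst = 0.0
    for k in range(n_steps):
        t = k * dt
        theta += dt * alpha * f(t, theta)  # forward Euler step
        if k >= 3 * n_steps // 4:
            worst = max(worst, abs(theta - theta_star))
    return worst


# Compare the steady-state error for a few constant gains.
errors = {a: qsa_steady_state_error(a) for a in (0.1, 0.05, 0.025)}
for a, e in errors.items():
    print(f"alpha={a:<6} max steady-state error={e:.5f}  error/alpha={e / a:.3f}")
```

In this toy setting the ratio error/alpha should stay roughly constant as the gain is halved, which is consistent with an $O(\alpha)$ estimation error for the unfiltered estimates; it is not a substitute for the paper's analysis.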

A Lyapunov Theory for Finite-Sample Guarantees of Markovian Stochastic Approximation

A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants

Stochastic Approximation with Unbounded Markovian Noise: A General-Purpose Theorem

Finite-Time Error Bounds of Biased Stochastic Approximation With Application to TD-Learning

Tight Finite Time Bounds of Two-Time-Scale Linear Stochastic Approximation with Markovian Noise

Stochastic Approximation Beyond Gradient for Signal Processing and Machine Learning

Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling

Optimal and instance-dependent guarantees for Markovian linear stochastic approximation

Finite-sample Guarantees for Nash Q-learning with Linear Function Approximation

Markovian Foundations for Quasi-Stochastic Approximation in Two Timescales: Extended Version

Finite-Time Error Bounds For Linear Stochastic Approximation and TD Learning

Langevin Dynamics: A Unified Perspective on Optimization via Lyapunov Potentials

A Statistical Online Inference Approach in Averaged Stochastic Approximation

The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise

Online Statistical Inference for Nonlinear Stochastic Approximation with Markovian Data

The Curse of Memory in Stochastic Approximation: Extended Version

Markovian Foundations for Quasi-Stochastic Approximation with Applications to Extremum Seeking Control

Stochastic Approximation for Nonlinear Discrete Stochastic Control: Finite-Sample Bounds

Finite-Sample Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation

Scalable Statistical Inference in Non-parametric Least Squares

Concentration of Contractive Stochastic Approximation: Additive and Multiplicative Noise