What problem does this paper attempt to address?

This paper addresses large-scale optimization problems, in particular those where computing the full gradient is computationally infeasible. Using the modified equation theory of numerical integrators, it proposes a class of stochastic differential equations (SDEs) that describe the dynamics of general stochastic optimization methods more accurately than the original gradient flow. Analyzing these modified SDEs reveals qualitative properties of the corresponding optimization methods.

### Main contributions of the paper:

1. **Proposition 4.3**: A modified SDE for general stochastic optimization iterations is proposed.
2. **Theorem 4.8**: Conditions are given that ensure mean-square stable convergence of the stochastic coordinate descent method to the minimizer.

### Paper structure:

- **Part 1**: Introduction, covering the research background and purpose.
- **Part 2**: Preliminaries, covering modified equations for ordinary differential equations (ODEs) and stochastic differential equations (SDEs).
- **Part 3**: The main ideas of stochastic optimization methods, with a focus on two cases: stochastic gradient descent and stochastic coordinate descent.
- **Part 4**: The main results, including the modified SDE for general stochastic optimization iterations and the mean-square stability conditions for the stochastic coordinate descent method.
- **Part 5**: Conclusion, with possible directions for future research.

### Key technical points:

- **Modified equation**: By analyzing the error between the approximate solution generated by the numerical method and the solution of the original equation, a modified SDE is derived that describes the dynamics of the numerical method more accurately.
- **Mean-square stability**: The mean-square stability of the modified SDE is analyzed, especially for the stochastic coordinate descent method, and conditions are given that ensure its stable convergence.

### Specific formulas:

- **General form of the modified SDE**:

  \[ d\tilde{X} = \left( -\nabla F(\tilde{X}) + h F_1(\tilde{X}) \right) dt + \sqrt{h}\, G_1(\tilde{X})\, dW \]

  where \( F_1 \) and \( G_1 \) satisfy:

  \[ F_1 = -\frac{1}{2} (\nabla \nabla F) \nabla F = -\frac{1}{4} \nabla \|\nabla F\|^2 \]

  \[ G_1 = \sqrt{E\left[(\hat{\nabla} F - \nabla F)(\hat{\nabla} F - \nabla F)^T\right]} \]

- **Modified equation of the stochastic coordinate descent method**:

  \[ \Sigma(\tilde{X}) = d \sum_{i = 1}^d U_i (\nabla F(\tilde{X})) (\nabla F(\tilde{X}))^T U_i^T - (\nabla F(\tilde{X})) (\nabla F(\tilde{X}))^T \]

- **Mean-square stability conditions**:

  \[ E[\|X(t) - X^\star\|^2] \leq e^{-\alpha t} \|X(0) - X^\star\|^2 \]

  where \(\alpha = 2\mu - hK + h(d - 1)L^2\), and \(X^\star\) is the unique minimizer of \(F\). If the step size satisfies \(h \leq \frac{2\mu}{(d - 1)L^2 - K}\), then:

  \[ \lim_{t \to \infty} E[\|X(t) - X^\star\|^2] = 0 \]

### Summary:

By introducing the method of modified equations, this paper provides a new perspective for analyzing the dynamics of stochastic optimization algorithms. For the stochastic coordinate descent method in particular, it gives conditions that ensure stable convergence. These results help to understand the behavior of existing algorithms.
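The expression for \(\Sigma\) can be checked numerically. The sketch below rests on assumptions this summary does not spell out: that \(U_i = e_i e_i^T\) selects the \(i\)-th coordinate, that the stochastic coordinate descent gradient estimator is \(d\, U_i \nabla F\) with \(i\) drawn uniformly from the \(d\) coordinates, and that \(\Sigma\) is the covariance under the square root in \(G_1\). The function names are illustrative and not taken from the paper.

```python
import numpy as np


def scd_covariance(grad):
    """Closed form: Sigma = d * sum_i U_i g g^T U_i^T - g g^T.

    Assumes U_i = e_i e_i^T, so U_i g g^T U_i^T keeps only the (i, i)
    entry g_i**2 and the sum over i collapses to diag(g**2).
    """
    d = grad.size
    return d * np.diag(grad**2) - np.outer(grad, grad)


def enumerated_covariance(grad):
    """Covariance of the assumed estimator d * U_i g, i uniform on d coordinates.

    Computed by exact enumeration over the d outcomes, not by sampling.
    """
    d = grad.size
    cov = np.zeros((d, d))
    for i in range(d):
        estimate = np.zeros(d)
        estimate[i] = d * grad[i]          # d * U_i * grad
        deviation = estimate - grad        # estimator minus full gradient
        cov += np.outer(deviation, deviation) / d  # each outcome has probability 1/d
    return cov


rng = np.random.default_rng(0)
g = rng.standard_normal(5)                 # stand-in for grad F at some point
sigma = scd_covariance(g)

# The closed form agrees with the enumerated covariance of the estimator.
assert np.allclose(sigma, enumerated_covariance(g))
# Its trace equals (d - 1) * ||grad F||^2.
assert np.isclose(np.trace(sigma), (g.size - 1) * (g @ g))
```

Under these assumptions the trace of \(\Sigma\) is \((d - 1)\|\nabla F\|^2\), which suggests where the factor \((d - 1)\) in the stability condition may come from.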

Backward error analysis and the qualitative behaviour of stochastic optimization algorithms: Application to stochastic coordinate descent

Bound Analysis of Natural Gradient Descent in Stochastic Optimization Setting

SDEs for Minimax Optimization

Stochastic Modified Equations and Dynamics of Stochastic Gradient Algorithms I: Mathematical Foundations

Asymptotic error analysis for stochastic gradient optimization schemes with first and second order modified equations

A backward SDE method for uncertainty quantification in deep learning

Non asymptotic analysis of Adaptive stochastic gradient algorithms and applications

The Stochastic Steepest Descent Method for Robust Optimization in Banach Spaces

Asymptotic Analysis via Stochastic Differential Equations of Gradient Descent Algorithms in Statistical and Computational Paradigms

Stochastic Differential Equations for Modeling First Order Optimization Methods

Strong backward error analysis of stochastic Poisson integrators

Error estimates of the backward Euler–Maruyama method for multi-valued stochastic differential equations

Uniform-in-Time Weak Error Analysis for Stochastic Gradient Descent Algorithms Via Diffusion Approximation

Convergence of Constant Step Stochastic Gradient Descent for Non-Smooth Non-Convex Functions

Continuous-time stochastic gradient descent for optimizing over the stationary distribution of stochastic differential equations

A Forward Propagation Algorithm for Online Optimization of Nonlinear Stochastic Differential Equations

An SDE Perspective on Stochastic Inertial Gradient Dynamics with Time-Dependent Viscosity and Geometric Damping

Almost Sure Convergence of Randomised‐difference Descent Algorithm for Stochastic Convex Optimisation

Stochastic Methods in Variational Inequalities: Ergodicity, Bias and Refinements

Weak Backward Error Analysis for SDEs

Stochastic Modified Equations and Adaptive Stochastic Gradient Algorithms