High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance

Abdurakhmon Sadiev, Marina Danilova, Eduard Gorbunov, Samuel Horváth, Gauthier Gidel, Pavel Dvurechensky, Alexander Gasnikov, Peter Richtárik

2023-07-18

Abstract: During recent years the interest of optimization and machine learning communities in high-probability convergence of stochastic optimization methods has been growing. One of the main reasons for this is that high-probability complexity bounds are more accurate and less studied than in-expectation ones. However, SOTA high-probability non-asymptotic convergence results are derived under strong assumptions such as the boundedness of the gradient noise variance or of the objective's gradient itself. In this paper, we propose several algorithms with high-probability convergence results under less restrictive assumptions. In particular, we derive new high-probability convergence results under the assumption that the gradient/operator noise has bounded central $\alpha$-th moment for $\alpha \in (1,2]$ in the following setups: (i) smooth non-convex / Polyak-Łojasiewicz / convex / strongly convex / quasi-strongly convex minimization problems, (ii) Lipschitz / star-cocoercive and monotone / quasi-strongly monotone variational inequalities. These results justify the usage of the considered methods for solving problems that do not fit standard functional classes studied in stochastic optimization.

Optimization and Control, Machine Learning

What problem does this paper attempt to address?

This paper studies the convergence of stochastic optimization methods under weaker assumptions than those typically required in the literature. Specifically, the authors develop algorithms with high-probability convergence guarantees that remain valid when the variance of the gradient noise is unbounded, a significant departure from the common bounded-variance assumption.

### Key Contributions

1. **Relaxed Assumptions**: The paper proposes algorithms that handle gradient noise with unbounded variance, requiring only that the noise has a bounded central α-th moment for α ∈ (1, 2]. This admits heavy-tailed noise distributions, which are more realistic in many practical applications.
2. **Extensive Coverage of Problems**:
   - **Minimization Problems**: Smooth non-convex, Polyak-Łojasiewicz (PL), convex, strongly convex, and quasi-strongly convex problems.
   - **Variational Inequalities**: Lipschitz, star-cocoercive, monotone, and quasi-strongly monotone variational inequalities.
3. **New High-Probability Convergence Results**:
   - For clipped-SGD and clipped-SSTM in various optimization settings, including convex, strongly convex, PL, and quasi-strongly convex minimization problems.
   - For clipped-SEG and clipped-SGDA in solving variational inequalities under different structured non-monotonicity assumptions.
4. **Optimality of Results**:
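
The clipping idea behind clipped-SGD, named in the contributions above, can be illustrated with a minimal sketch. It assumes the standard clipping operator clip(g, λ) = min{1, λ/‖g‖}·g applied to each stochastic gradient before the step. The function names, the quadratic test objective, the symmetrized Lomax noise (tail index 1.5, so the variance is infinite while moments of order below 1.5 stay finite), and the constant step size and clipping level are all illustrative assumptions, not the parameter schedules analyzed in the paper.

```python
import numpy as np


def clip(g, lam):
    """Rescale g so that its Euclidean norm is at most lam."""
    norm = np.linalg.norm(g)
    if norm <= lam:
        return g
    return g * (lam / norm)


def clipped_sgd(grad_oracle, x0, step_size, clip_level, num_steps):
    """Run SGD where every stochastic gradient is clipped before the update."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(num_steps):
        g = grad_oracle(x)
        x = x - step_size * clip(g, clip_level)
    return x


def make_heavy_tailed_oracle(rng, tail_index=1.5):
    """Hypothetical oracle for f(x) = 0.5 * ||x||^2 with heavy-tailed noise.

    The noise is a Lomax (Pareto II) magnitude with a random sign, so it is
    symmetric around zero and has infinite variance when tail_index <= 2.
    """
    def oracle(x):
        magnitude = rng.pareto(tail_index, size=x.shape)
        sign = rng.choice([-1.0, 1.0], size=x.shape)
        return x + sign * magnitude
    return oracle


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    oracle = make_heavy_tailed_oracle(rng)
    x_final = clipped_sgd(oracle, x0=np.full(5, 10.0), step_size=0.05,
                          clip_level=2.0, num_steps=2000)
    print("distance to minimizer:", np.linalg.norm(x_final))
```

Because each update moves the iterate by at most `step_size * clip_level`, a single extreme noise sample cannot throw the iterate arbitrarily far, which is the intuition for why clipping helps under heavy-tailed noise. The price is that clipping biases the gradient estimate, so the clipping level has to be balanced against the step size.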

High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise

High-Probability Complexity Bounds for Non-smooth Stochastic Convex Optimization with Heavy-Tailed Noise

High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise

Method with batching for stochastic finite-sum variational inequalities in non-Euclidean setting

Generalized Smooth Stochastic Variational Inequalities: Almost Sure Convergence and Convergence Rates

Methods for Solving Variational Inequalities with Markovian Stochasticity

High Probability Convergence Bounds for Non-convex Stochastic Gradient Descent with Sub-Weibull Noise

Stochastic Nonsmooth Convex Optimization with Heavy-Tailed Noises: High-Probability Bound, In-Expectation Rate and Initial Distance Adaptation

Stochastic Methods in Variational Inequalities: Ergodicity, Bias and Refinements

Optimal Analysis of Method with Batching for Monotone Stochastic Finite-Sum Variational Inequalities

Variational Analysis in the Wasserstein Space

Convergence analysis of stochastic higher-order majorization-minimization algorithms

Optimistic Dual Extrapolation for Coherent Non-monotone Variational Inequalities

High-probability complexity guarantees for nonconvex minimax problems

Universal methods for variational inequalities: deterministic and stochastic cases

Provable convergence guarantees for black-box variational inference

Stochastic Variance-Reduced Majorization-Minimization Algorithms

Variance reduction techniques for stochastic proximal point algorithms

Adaptive Methods for Variational Inequalities with Relatively Smooth and Relatively Strongly Monotone Operators