On adaptive stochastic heavy ball momentum for solving linear systems

Yun Zeng, Deren Han, Yansheng Su, Jiaxin Xie

2024-04-03

Abstract: The stochastic heavy ball momentum (SHBM) method has gained considerable popularity as a scalable approach for solving large-scale optimization problems. However, one limitation of this method is its reliance on prior knowledge of certain problem parameters, such as singular values of a matrix. In this paper, we propose an adaptive variant of the SHBM method for solving stochastic problems that are reformulated from linear systems using user-defined distributions. Our adaptive SHBM (ASHBM) method utilizes iterative information to update the parameters, addressing an open problem in the literature regarding the adaptive learning of momentum parameters. We prove that our method converges linearly in expectation, with a better convergence bound compared to the basic method. Notably, we demonstrate that the deterministic version of our ASHBM algorithm can be reformulated as a variant of the conjugate gradient (CG) method, inheriting many of its appealing properties, such as finite-time convergence. Consequently, the ASHBM method can be further generalized to develop a brand-new framework of the stochastic CG (SCG) method for solving linear systems. Our theoretical results are supported by numerical experiments.

Optimization and Control

What problem does this paper attempt to address?

This paper addresses the problem of solving large-scale linear systems and proposes an improved stochastic heavy ball momentum (SHBM) method. Specifically, its goals are:

1. **Removing the parameter dependency**: Existing SHBM methods require prior knowledge of certain problem parameters (e.g., singular values of the matrix), which is often impractical in real-world applications. The paper therefore proposes an adaptive SHBM (ASHBM) method that updates these parameters dynamically during the iteration.
2. **Improving convergence performance**: In terms of the convergence factor of the expected error, existing SHBM methods are not as effective as traditional stochastic gradient descent (SGD) when solving linear systems. By introducing adaptive parameters α_k and β_k, the paper shows that the ASHBM method achieves a better convergence bound.
3. **Theoretical and experimental validation**: The paper proves that the proposed ASHBM method converges linearly in expectation and validates its effectiveness through numerical experiments.

In summary, the main objective of this paper is to propose a new adaptive SHBM method for solving large-scale linear systems and to support it both theoretically and experimentally.
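The summary does not state the update rule itself, so the following is only a minimal sketch of how a heavy ball step with adaptive α_k and β_k could look, not the paper's algorithm verbatim. It assumes a consistent system Ax = b, single-row sampling with probability proportional to the squared row norm (as in randomized Kaczmarz), and the update x_{k+1} = x_k − α_k g_k + β_k (x_k − x_{k−1}). Here α_k and β_k are chosen to minimize the distance to the solution over the plane spanned by the stochastic gradient and the momentum direction. The function name `adaptive_heavy_ball_kaczmarz`, the helper `dot`, and the tolerances are hypothetical choices made for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def adaptive_heavy_ball_kaczmarz(A, b, iters=5000, seed=0):
    """Illustrative adaptive heavy ball iteration for a consistent system Ax = b.

    Assumption: one row is sampled per step. For a consistent system,
    <g, x - x*> = ||g||^2 for the normalized row gradient g, and
    <d, x - x*> = 0 after an exact plane search, so the minimizing
    (alpha, beta) can be computed without knowing the solution x*.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    row_sq = [dot(row, row) for row in A]
    x_prev = [0.0] * n
    x = [0.0] * n
    for _ in range(iters):
        # Sample row i with probability proportional to ||a_i||^2.
        i = rng.choices(range(m), weights=row_sq)[0]
        r = dot(A[i], x) - b[i]
        # Skip rows already satisfied up to rounding error (for example the
        # row used in the previous step); the momentum direction is kept.
        scale = (row_sq[i] * dot(x, x)) ** 0.5 + abs(b[i])
        if abs(r) <= 1e-12 * scale:
            continue
        g = [r / row_sq[i] * a for a in A[i]]        # normalized row gradient
        d = [xi - pi for xi, pi in zip(x, x_prev)]   # momentum direction
        gg, dd, gd = dot(g, g), dot(d, d), dot(g, d)
        det = gg * dd - gd * gd
        if det > 1e-12 * gg * dd:
            # Exact minimizer of ||x - alpha*g + beta*d - x*||^2.
            alpha, beta = gg * dd / det, gg * gd / det
        else:
            # No usable momentum direction: plain Kaczmarz projection.
            alpha, beta = 1.0, 0.0
        x_prev, x = x, [xi - alpha * gi + beta * di
                        for xi, gi, di in zip(x, g, d)]
    return x
```

No singular values or other spectral information enter the step: α_k and β_k depend only on the current residual, the sampled row, and the previous two iterates. Each step also moves at least as far toward the solution as the plain Kaczmarz projection from the same iterate, since that projection lies in the search plane.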

On adaptive stochastic extended iterative methods for solving least squares

(Accelerated) Noise-adaptive Stochastic Heavy-Ball Momentum

Stochastic Euler Heavy Ball Method

Accelerated Stochastic ADMM with Variance Reduction

A New Variant of Stochastic Heavy Ball Optimization Method for Deep Learning

Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance

A Unified Analysis of Stochastic Momentum Methods for Deep Learning

Stochastic Momentum Method with Double Acceleration for Regularized Empirical Risk Minimization

Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise

On the fast convergence of minibatch heavy ball momentum

Convergence analysis of a stochastic heavy-ball method for linear ill-posed problems

Unified Convergence Analysis of Stochastic Momentum Methods for Convex and Non-convex Optimization

Unified Convergence Analysis for Adaptive Optimization with Moving Average Estimator

Nonsmooth Nonconvex Stochastic Heavy Ball

On the Convergence Analysis of Aggregated Heavy-Ball Method

Combining Conjugate Gradient and Momentum for Unconstrained Stochastic Optimization With Applications to Machine Learning

Accelerated Over-Relaxation Heavy-Ball Methods with Provable Acceleration and Global Convergence

Training Deep Neural Networks with Adaptive Momentum Inspired by the Quadratic Optimization

On the adaptive deterministic block Kaczmarz method with momentum for solving large-scale consistent linear systems