Seth J. Alderman, Roan W. Luikart, Nicholas F. Marshall
Abstract: This paper studies the effect of adding geometrically smoothed momentum to the randomized Kaczmarz algorithm, which is an instance of stochastic gradient descent on a linear least squares loss function. We prove a result about the expected error in the direction of singular vectors of the matrix defining the least squares loss. We present several numerical examples illustrating the utility of our result and pose several questions.
What problem does this paper attempt to address?
The paper asks how adding geometrically smoothed momentum to the randomized Kaczmarz algorithm can accelerate its convergence in the direction of singular vectors associated with small singular values.
### Problem Background
The randomized Kaczmarz algorithm is an iterative method for solving linear least-squares problems. Essentially, it is a stochastic gradient descent algorithm with a batch size of 1. Specifically, given a matrix \(A\in\mathbb{R}^{m\times n}\) and a vector \(b\in\mathbb{R}^m\), the least-squares loss function is defined as:
\[
f(x)=\frac{1}{2}\|Ax - b\|_2^2=\frac{1}{2}\sum_{i = 1}^m(\langle a_i,x\rangle - b_i)^2
\]
where \(a_i^{\top}\) is the \(i\)-th row of the matrix \(A\), and \(b_i\) is the \(i\)-th entry of the vector \(b\). The randomized Kaczmarz algorithm updates its iterate \(x_k\) as follows:
\[
x_{k + 1}=x_k+\frac{b_{i_k}-\langle a_{i_k},x_k\rangle}{\|a_{i_k}\|_2^2}a_{i_k}
\]
where \(i_k\) is an index randomly selected from \(\{1,\ldots,m\}\).
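The update above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the function name `randomized_kaczmarz`, the test problem, and the choice to sample row \(i\) with probability proportional to \(\|a_i\|_2^2\) are assumptions, since the text only says the index is selected at random. Each step is a gradient step on the single term \(\frac{1}{2}(\langle a_i,x\rangle - b_i)^2\) with step size \(1/\|a_i\|_2^2\), which is why the method can be read as stochastic gradient descent with batch size 1.

```python
import numpy as np


def randomized_kaczmarz(A, b, x0, num_iters, rng=None):
    """Run num_iters randomized Kaczmarz steps and return the final iterate."""
    rng = np.random.default_rng() if rng is None else rng
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.array(x0, dtype=float)
    row_norms_sq = np.sum(A * A, axis=1)
    # Assumption: rows are drawn with probability proportional to ||a_i||_2^2.
    probs = row_norms_sq / row_norms_sq.sum()
    for _ in range(num_iters):
        i = rng.choice(A.shape[0], p=probs)
        residual = b[i] - A[i] @ x
        # Move x onto the hyperplane <a_i, x> = b_i.
        x = x + (residual / row_norms_sq[i]) * A[i]
    return x


# Hypothetical test problem: a consistent Gaussian system.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 10))
x_true = rng.standard_normal(10)
b = A @ x_true
x_hat = randomized_kaczmarz(A, b, np.zeros(10), 2000, rng)
rk_error = np.linalg.norm(x_hat - x_true)
print(rk_error)
```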
### Core Problem of the Paper
The paper explores how to improve the convergence performance of the randomized Kaczmarz algorithm by introducing geometrically smoothed momentum. Specifically, the authors propose the randomized Kaczmarz algorithm with geometrically smoothed momentum (KGSM), whose iterative formula is as follows:
\[
\begin{cases}
x_{k+1}=x_k+\frac{b_{i_k}-\langle a_{i_k},x_k\rangle}{\|a_{i_k}\|_2^2}a_{i_k}+My_k\\
y_{k+1}=\beta y_k+(1 - \beta)(x_{k+1}-x_k)
\end{cases}
\]
where \(M\in[0,1]\) and \(\beta\in[0,1)\) are parameters that control the momentum. The authors' main results show that KGSM can significantly accelerate convergence in the direction of singular vectors associated with small singular values.
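A minimal NumPy sketch of the KGSM iteration follows, written directly from the two-line recursion above. Several details are assumptions not stated in the text: the function name `kgsm`, the initialization \(y_0 = 0\), row sampling with probability proportional to \(\|a_i\|_2^2\), and the illustrative values \(M = 0.2\) and \(\beta = 0.5\).

```python
import numpy as np


def kgsm(A, b, x0, M, beta, num_iters, rng=None):
    """Randomized Kaczmarz with geometrically smoothed momentum (sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.array(x0, dtype=float)
    y = np.zeros_like(x)  # Assumption: the momentum term starts at y_0 = 0.
    row_norms_sq = np.sum(A * A, axis=1)
    # Assumption: rows are drawn with probability proportional to ||a_i||_2^2.
    probs = row_norms_sq / row_norms_sq.sum()
    for _ in range(num_iters):
        i = rng.choice(A.shape[0], p=probs)
        residual = b[i] - A[i] @ x
        # Kaczmarz step plus the momentum term M * y_k.
        x_next = x + (residual / row_norms_sq[i]) * A[i] + M * y
        # Geometric smoothing of the most recent displacement.
        y = beta * y + (1.0 - beta) * (x_next - x)
        x = x_next
    return x


# Hypothetical test problem: a consistent Gaussian system.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 10))
x_true = rng.standard_normal(10)
b = A @ x_true
x_kgsm = kgsm(A, b, np.zeros(10), M=0.2, beta=0.5, num_iters=2000, rng=rng)
kgsm_error = np.linalg.norm(x_kgsm - x_true)
print(kgsm_error)
```

Setting `M=0` removes the momentum term from the \(x\) update, so the sketch reduces to the plain randomized Kaczmarz iteration.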
### Main Contributions
1. **Introduced the KGSM algorithm**: Through a simple modification, geometrically smoothed momentum is introduced into the randomized Kaczmarz algorithm, and its rich dynamic behaviors are demonstrated.
2. **Answered a question posed by Steinerberger**: namely, how convergence results along singular vector directions can be used to accelerate convergence. The authors support this idea with theoretical analysis and numerical experiments.
3. **Established an in-depth understanding of stochastic optimization methods in linear algebra**: In particular, the relationship between geometrically smoothed momentum and singular values and singular vectors.
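The error in the direction of a singular vector, the quantity these contributions revolve around, can be computed from an SVD of \(A\). The sketch below is illustrative only: the helper name `singular_direction_errors` and the small example matrix are assumptions, and it reports \(|\langle x_k - x, v_\ell\rangle|\) for each right singular vector \(v_\ell\) with respect to a reference solution \(x\).

```python
import numpy as np


def singular_direction_errors(A, x_k, x_star):
    """Return the singular values of A and |<x_k - x_star, v_l>| for each right singular vector v_l."""
    _, sigma, Vt = np.linalg.svd(np.asarray(A, dtype=float), full_matrices=False)
    diff = np.asarray(x_k, dtype=float) - np.asarray(x_star, dtype=float)
    # Rows of Vt are the right singular vectors, ordered by decreasing singular value.
    return sigma, np.abs(Vt @ diff)


# Hypothetical example: singular values 3 and 1 with axis-aligned singular vectors.
A = np.array([[3.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
sigma, errs = singular_direction_errors(A, [1.0, 2.0], [0.0, 0.0])
print(sigma, errs)
```

Applying this helper to the iterates of a run gives one error curve per singular vector, which makes it possible to see which directions converge slowly.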
### Theoretical Results
The authors' main theoretical result concerns the expected error of KGSM in the direction of singular vectors. For given parameters \(\beta\in[0,1)\) and \(M\in[0,1]\) and an initial vector \(x_0\), the algorithm iterates as:
\[
\begin{cases}
x_{k+1}=x_k+\frac{b_{i_k}-\langle a_{i_k},x_k\rangle}{\|a_{i_k}\|_2^2}a_{i_k}+My_k\\
y_{k+1}=\beta y_k+(1 - \beta)(x_{k+1}-x_k)
\end{cases}
\]
Main Theorem (Theor