Randomized Kaczmarz with tail averaging

Ethan N. Epperly, Gil Goldshlager, Robert J. Webber
2024-11-30
Abstract: The randomized Kaczmarz (RK) method is a well-known approach for solving linear least-squares problems with a large number of rows. RK accesses and processes just one row at a time, leading to exponentially fast convergence for consistent linear systems. However, RK fails to converge to the least-squares solution for inconsistent systems. This work presents a simple fix: average the RK iterates produced in the tail part of the algorithm. The proposed tail-averaged randomized Kaczmarz (TARK) converges for both consistent and inconsistent least-squares problems at a polynomial rate, which is known to be optimal for any row-access method. An extension of TARK also leads to efficient solutions for ridge-regularized least-squares problems.
Numerical Analysis
What problem does this paper attempt to address?
The paper addresses the failure of the randomized Kaczmarz (RK) method to converge to the least-squares solution of an inconsistent linear system. When the system is inconsistent (i.e., \( b \neq Ax^\star \), where \( x^\star \) is the least-squares solution), the RK iterates get only within a finite distance of \( x^\star \) and do not converge to it. To fix this, the authors propose the tail-averaged randomized Kaczmarz (TARK) method. By averaging the RK iterates produced in the tail part of the algorithm, TARK converges to the least-squares solution at a polynomial rate for both consistent and inconsistent least-squares problems. The paper also extends TARK to ridge-regularized least-squares problems.

### Formula Summary

1. **Least-squares problem**:
   \[ \min_{x \in \mathbb{R}^d} \|b - Ax\|^2 \quad \text{for} \quad A \in \mathbb{R}^{n \times d} \quad \text{and} \quad b \in \mathbb{R}^n \quad \text{with} \quad n > d \]
2. **Randomized Kaczmarz update formula**:
   \[ x_{t + 1} := x_t + \frac{b_{i_t} - a_{i_t}^\top x_t}{\|a_{i_t}\|^2} a_{i_t} \]
   where \( a_i^\top \) denotes the \( i \)-th row of \( A \) and the row index \( i_t \) is sampled according to the probability distribution \( P\{i_t = i\} = \frac{\|a_i\|^2}{\|A\|_F^2} \).
3. **Tail-averaged estimator**:
   \[ \bar{x}_t := \frac{1}{t - t_b} \sum_{s = t_b}^{t - 1} x_s \]
   where \( t_b \) is the warm-up (burn-in) time and \( t \) is the final time.
4. **MSE bound of TARK**:
   \[ \mathbb{E}\|\bar{x}_t - x^\star\|^2 \leq \left(1 - \kappa_{\text{dem}}^{-2}\right)^{t_b} \|x_0 - x^\star\|^2 + \frac{2 \kappa_{\text{dem}}^2 - 1}{t - t_b} \|A^+\|^2 \|b - Ax^\star\|^2 \]
   where \( A^+ \) is the pseudoinverse of \( A \) and the Demmel condition number is \( \kappa_{\text{dem}} := \|A^+\| \|A\|_F \).
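The update rule and the tail-averaged estimator summarized above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' reference implementation: the function names `tark` and `demmel_condition_number`, their arguments, and the choice to pre-sample all row indices are assumptions made here for clarity. The sketch also assumes `A` has full column rank and that `t > t_b`.

```python
import numpy as np


def demmel_condition_number(A):
    """Return kappa_dem = ||A^+|| * ||A||_F (assumes A has full column rank)."""
    singular_values = np.linalg.svd(A, compute_uv=False)
    # For full column rank, the spectral norm of A^+ is 1 / sigma_min(A).
    return np.linalg.norm(A, "fro") / singular_values.min()


def tark(A, b, t, t_b, x0=None, rng=None):
    """Illustrative tail-averaged randomized Kaczmarz.

    Runs RK from x_0 and returns (x_bar, x_t), where x_bar is the mean of
    the iterates x_{t_b}, ..., x_{t-1} and x_t is the last plain RK iterate.
    """
    if not 0 <= t_b < t:
        raise ValueError("need 0 <= t_b < t")
    rng = np.random.default_rng() if rng is None else rng
    n, d = A.shape

    # Sample rows with probability ||a_i||^2 / ||A||_F^2.
    row_norms_sq = np.einsum("ij,ij->i", A, A)
    probs = row_norms_sq / row_norms_sq.sum()
    indices = rng.choice(n, size=t, p=probs)

    x = np.zeros(d) if x0 is None else np.array(x0, dtype=float)
    x_sum = np.zeros(d)
    for s in range(t):
        if s >= t_b:
            x_sum += x  # accumulate x_s before it is updated to x_{s+1}
        i = indices[s]
        # Kaczmarz step: project x onto the hyperplane a_i^T x = b_i.
        x = x + (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]
    return x_sum / (t - t_b), x
```

On an inconsistent system, comparing both returned vectors against `np.linalg.lstsq(A, b, rcond=None)[0]` is one way to see the effect of averaging: the last RK iterate typically stays at a noticeable distance from the least-squares solution, while the tail average gets closer as `t - t_b` grows.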
With tail averaging, TARK removes the convergence barrier that RK faces on inconsistent least-squares problems while keeping its one-row-at-a-time access pattern, which makes it a candidate for large-scale problems with many rows.