The Nesterov-Spokoiny Acceleration: $o(1/k^2)$ Convergence without Proximal Operations

Weibin Peng, Tianyu Wang
2023-01-01
Abstract: This paper studies a variant of an accelerated gradient algorithm of Nesterov and Spokoiny, which we call the Nesterov-Spokoiny Acceleration (NSA). For smooth convex programs, the NSA algorithm satisfies the following properties: 1. The sequence $\{ \mathbf{x}_k \}_{k \in \mathbb{N}}$ governed by the NSA satisfies $\limsup\limits_{k \to \infty} k^2 ( f(\mathbf{x}_k) - f^* ) = 0$, where $f^* > -\infty$ is the minimum of the smooth convex function $f$. 2. The sequence $\{ \mathbf{x}_k \}_{k \in \mathbb{N}}$ governed by the NSA satisfies $\liminf\limits_{k \to \infty} k^2 \log k \log\log k \, ( f(\mathbf{x}_k) - f^* ) = 0$. 3. The sequence $\{ \mathbf{y}_k \}_{k \in \mathbb{N}}$ governed by the NSA satisfies $\liminf\limits_{k \to \infty} k^3 \log k \log\log k \, \| \nabla f(\mathbf{y}_k) \|^2 = 0$. Item 1 is perhaps more important than items 2 and 3: for general smooth convex programs, NSA is the first gradient algorithm that achieves an $o(k^{-2})$ convergence rate without proximal operations. Some extensions of the NSA algorithm are also studied. In addition, our study of a zeroth-order variant of NSA shows that $o(1/k^2)$ convergence can be achieved via estimated gradients.
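The three properties can be restated in asymptotic notation. The following is a sketch that relies only on the standard definitions of $\limsup$, $\liminf$, and little-$o$, together with the fact that $f(\mathbf{x}_k) - f^* \ge 0$; the subsequence indices $k_j$ and $k'_j$ are notation introduced here for illustration and do not come from the paper.

$$
\begin{aligned}
\text{Item 1:}\quad & \limsup_{k \to \infty} k^2 \left( f(\mathbf{x}_k) - f^* \right) = 0
\iff f(\mathbf{x}_k) - f^* = o\!\left( k^{-2} \right) \text{ along the full sequence,} \\
\text{Item 2:}\quad & \exists\, k_j \to \infty : \;
f(\mathbf{x}_{k_j}) - f^* = o\!\left( \frac{1}{k_j^2 \log k_j \log\log k_j} \right), \\
\text{Item 3:}\quad & \exists\, k'_j \to \infty : \;
\| \nabla f(\mathbf{y}_{k'_j}) \|^2 = o\!\left( \frac{1}{(k'_j)^3 \log k'_j \log\log k'_j} \right).
\end{aligned}
$$

Read this way, item 1 is a guarantee for every sufficiently large $k$, whereas items 2 and 3 give faster rates only along some subsequence, which is consistent with the abstract singling out item 1 as the more important statement.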