Abstract: Non-asymptotic convergence analysis of quasi-Newton methods has gained attention with a landmark result establishing an explicit local superlinear rate of $O((1/\sqrt{t})^t)$. The methods that obtain this rate, however, exhibit a well-known drawback: they require the storage of the previous Hessian approximation matrix or all past curvature information to form the current Hessian inverse approximation. Limited-memory variants of quasi-Newton methods, such as the celebrated L-BFGS, alleviate this issue by leveraging a limited window of past curvature information to construct the Hessian inverse approximation. As a result, their per-iteration complexity and storage requirement are $O(\tau d)$, where $\tau\le d$ is the size of the window and $d$ is the problem dimension, reducing the $O(d^2)$ computational cost and memory requirement of standard quasi-Newton methods. However, to the best of our knowledge, there is no result showing a non-asymptotic superlinear convergence rate for any limited-memory quasi-Newton method. In this work, we close this gap by presenting a Limited-memory Greedy BFGS (LG-BFGS) method that can achieve an explicit non-asymptotic superlinear rate. We incorporate displacement aggregation, i.e., decorrelating projection, in post-processing gradient variations, together with a basis vector selection scheme on variable variations, which greedily maximizes a progress measure of the Hessian estimate to the true Hessian. Their combination allows past curvature information to remain in a sparse subspace while yielding a valid representation of the full history. Interestingly, our established non-asymptotic superlinear convergence rate demonstrates an explicit trade-off between the convergence speed and memory requirement, which, to our knowledge, is the first of its kind. Numerical results corroborate our theoretical findings and demonstrate the effectiveness of our method.
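
For context, the $O(\tau d)$ per-iteration cost of classical L-BFGS mentioned in the abstract is usually obtained with the two-loop recursion, which applies the Hessian inverse approximation to a gradient using only the $\tau$ stored pairs of variable variations $s_i$ and gradient variations $y_i$. The following is a minimal sketch of that standard recursion, not of the LG-BFGS method proposed here. The function name `lbfgs_direction`, the list-based memory, and the scaled-identity initial matrix are illustrative assumptions.

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Apply the L-BFGS inverse-Hessian approximation to `grad`.

    s_list, y_list: stored pairs (oldest first), each a 1-D array of size d.
    Cost is O(tau * d) for tau stored pairs; no d-by-d matrix is formed.
    Assumes every pair satisfies the curvature condition y @ s > 0.
    """
    q = np.asarray(grad, dtype=float).copy()
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []

    # First loop: traverse the memory from the newest pair to the oldest.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        alpha = rho * np.dot(s, q)
        q -= alpha * y
        alphas.append(alpha)

    # Initial matrix H_0 = gamma * I (a common scaling choice, assumed here).
    if s_list:
        gamma = np.dot(s_list[-1], y_list[-1]) / np.dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = gamma * q

    # Second loop: traverse the memory from the oldest pair to the newest.
    for s, y, rho, alpha in zip(s_list, y_list, rhos, reversed(alphas)):
        beta = rho * np.dot(y, r)
        r += (alpha - beta) * s
    return r
```

A quasi-Newton step would then use `-lbfgs_direction(grad, s_list, y_list)` as the search direction and drop the oldest pair once more than $\tau$ pairs are stored. A quick sanity check is the secant condition: applying the function to the most recent $y$ returns the most recent $s$.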

Explicit Convergence Rates of Greedy and Random Quasi-Newton Methods

Greedy and Random Quasi-Newton Methods with Faster Explicit Superlinear Convergence

Faster Explicit Superlinear Convergence for Greedy and Random Quasi-Newton Methods

Adaptive Greedy Quasi-Newton with Superlinear Rate and Global Convergence Guarantee

Explicit Superlinear Convergence Rates of Broyden's Methods in Nonlinear Equations

Incremental Quasi-Newton Methods with Faster Superlinear Convergence Rates

Distributed Adaptive Greedy Quasi-Newton Methods with Explicit Non-asymptotic Convergence Bounds

Limited-Memory Greedy Quasi-Newton Method with Non-asymptotic Superlinear Convergence Rate

Online Learning Guided Quasi-Newton Methods with Global Non-Asymptotic Convergence

Greedy and Random Broyden's Methods with Explicit Superlinear Convergence Rates in Nonlinear Equations

Explicit Superlinear Convergence of Broyden's Method in Nonlinear Equations

A Stochastic Quasi-Newton Method for Non-convex Optimization with Non-uniform Smoothness

Online Learning Guided Curvature Approximation: A Quasi-Newton Method with Global Non-Asymptotic Superlinear Convergence

Explicit Superlinear Convergence Rates of The SR1 Algorithm

Global non-asymptotic super-linear convergence rates of regularized proximal quasi-Newton methods on non-smooth composite problems

Accelerated Quasi-Newton Proximal Extragradient: Faster Rate for Smooth Convex Optimization

A Single-Loop Stochastic Proximal Quasi-Newton Method for Large-Scale Nonsmooth Convex Optimization

Zeroth-order Gradient and Quasi-Newton Methods for Nonsmooth Nonconvex Stochastic Optimization

Revisiting Sub-sampled Newton Methods

A Unifying Framework for Convergence Analysis of Approximate Newton Methods