A globalization of L-BFGS and the Barzilai-Borwein method for nonconvex unconstrained optimization

Florian Mannel
2024-09-11
Abstract: We present a modified limited memory BFGS (L-BFGS) method that converges globally and linearly for nonconvex objective functions. Its distinguishing feature is that it turns into L-BFGS if the iterates cluster at a point near which the objective is strongly convex with Lipschitz gradients, thereby inheriting the outstanding effectiveness of the classical method. These strong convergence guarantees are enabled by a novel form of cautious updating, where, among others, it is decided anew in each iteration which of the stored pairs are used for updating and which ones are skipped. In particular, this yields the first modification of cautious updating for which all cluster points are stationary while the spectrum of the L-BFGS operator is not permanently restricted, and this holds without Lipschitz continuity of the gradient. In fact, for Wolfe-Powell line searches we show that continuity of the gradient is sufficient for global convergence, which extends to other descent methods. Since we allow the memory size to be zero in the globalized L-BFGS method, we also obtain a new globalization of the Barzilai-Borwein spectral gradient (BB) method. The convergence analysis is developed in Hilbert space under comparably weak assumptions and covers Armijo and Wolfe-Powell line searches. We illustrate the theoretical findings with numerical experiments. The experiments indicate that if one of the parameters of the cautious updating is chosen sufficiently small, then the modified method agrees entirely with L-BFGS/BB. We also discuss this in the theoretical part. An implementation of the new method is available on arXiv.
Optimization and Control
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper addresses the lack of global convergence guarantees for the L-BFGS method on nonconvex unconstrained optimization problems. The author proposes a modified limited memory BFGS (L-BFGS) method that converges globally and linearly for nonconvex objective functions. The main contributions are:

1. **Global Convergence**: The modified L-BFGS method converges globally on nonconvex problems, in the sense that every cluster point of the iterates is a stationary point. For Wolfe-Powell line searches, continuity of the gradient suffices; Lipschitz continuity is not required.
2. **Linear Convergence Rate**: If the iterates cluster at a point near which the objective is strongly convex with a Lipschitz continuous gradient, the method converges linearly and turns into the classical L-BFGS method.
3. **Cautious Update**: A new form of cautious updating decides anew in each iteration which of the stored pairs are used for updating and which are skipped. This makes all cluster points stationary without permanently restricting the spectrum of the L-BFGS operator.
4. **Applicability to the Zero Memory Case**: Because the memory size is allowed to be zero, the method also yields a new globalization of the Barzilai-Borwein spectral gradient (BB) method.

Together, these changes equip L-BFGS with global convergence guarantees for nonconvex problems. The numerical experiments indicate that the modified method agrees entirely with L-BFGS/BB when one of the cautious updating parameters is chosen sufficiently small.