Newton Method Revisited: Global Convergence Rates up to $\mathcal {O}\left(k^{-3} \right)$ for Stepsize Schedules and Linesearch Procedures

Slavomír Hanzely, Farshed Abdukhakimov, Martin Takáč
2024-11-20
Abstract: This paper investigates the global convergence of stepsized Newton methods for convex functions with Hölder continuous Hessians or third derivatives. We propose several simple stepsize schedules with fast global convergence guarantees, up to $\mathcal {O}\left(k^{-3} \right)$. For cases with multiple plausible smoothness parameterizations or an unknown smoothness constant, we introduce a stepsize linesearch and a backtracking procedure with provable convergence as if the optimal smoothness parameters were known in advance. Additionally, we present strong convergence guarantees for the practically popular Newton method with exact linesearch.
Optimization and Control
What problem does this paper attempt to address?
This paper addresses the global convergence of Newton's method for convex functions whose Hessians or third derivatives are Hölder continuous. Specifically, its objectives are:

1. **Improving the global convergence rate**: The paper proposes new stepsize schedules that let the stepsized Newton method reach global convergence rates as fast as \(O(k^{-3})\). This is a significant improvement over existing methods (such as the \(O(k^{-2})\) rate of Hanzely et al. [2022]).
2. **Handling unknown smoothness parameters**: In practice, the smoothness parameters are usually unknown, or several parameterizations are plausible. The paper introduces a stepsize linesearch and a backtracking procedure that converge as if the optimal smoothness parameters had been known in advance.
3. **Analyzing the limitations of the classical Newton's method**: The classical Newton's method, which always takes a full step, may diverge when started far from the solution. By combining stepsize schedules with linesearch, the paper removes this limitation and guarantees global convergence.
4. **Viewing Newton's method as a third-order tensor method**: By assuming Hölder continuity of the third derivative, the paper reinterprets Newton's method along the lines of a third-order tensor method. This perspective provides a new theoretical basis for the design of optimization algorithms.
5. **Providing performance guarantees in practical applications**: Beyond the theoretical analysis, the paper shows through experimental comparisons that the proposed algorithms (RN, UN and GRLS) outperform existing methods in most scenarios.
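The divergence described in point 3 and the backtracking idea in point 2 can be illustrated with a minimal sketch. This is not the paper's stepsize schedule or its backtracking procedure: it uses a generic Armijo sufficient-decrease test on a scalar toy function, and the names `damped_newton`, `c`, `shrink`, and `tol` are assumptions made for this example.

```python
import math


def f(x):
    # Toy convex function f(x) = log(e^x + e^-x), written to avoid overflow.
    return abs(x) + math.log1p(math.exp(-2.0 * abs(x)))


def grad(x):
    # f'(x) = tanh(x)
    return math.tanh(x)


def hess(x):
    # f''(x) = 1 - tanh(x)^2, positive but tiny far from the minimizer at 0
    return 1.0 - math.tanh(x) ** 2


def damped_newton(x0, c=1e-4, shrink=0.5, tol=1e-8, max_iter=50):
    """Scalar Newton iteration with a backtracked stepsize (illustrative only)."""
    x = float(x0)
    steps = []  # accepted stepsizes, kept for inspection
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) <= tol:
            break
        d = -g / hess(x)  # full Newton direction
        alpha = 1.0
        # Halve alpha until the Armijo test passes (at most 60 halvings).
        for _halving in range(60):
            if f(x + alpha * d) <= f(x) + c * alpha * g * d:
                break
            alpha *= shrink
        x += alpha * d
        steps.append(alpha)
    return x, steps


if __name__ == "__main__":
    x_final, stepsizes = damped_newton(3.0)
    print(x_final, stepsizes)
```

From the starting point 3.0, a full Newton step on this function overshoots to roughly -98, further from the minimizer than where it started, whereas the backtracked iteration shrinks the first step and then returns to full steps near the solution.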
### Summary of the core problem

The core problem of the paper is: how can one design a simple and efficient stepsize schedule that improves the global convergence rate of Newton's method for convex optimization problems with Hölder continuous Hessians or third derivatives? The paper also explores how to ensure convergence when the smoothness parameters are unknown, and proposes corresponding linesearch and backtracking strategies. Through this research, the paper fills a gap in the global convergence analysis of Newton's method and provides stronger optimization tools for fields such as deep learning and scientific computing.
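For reference, the two ingredients named above can be written in a standard textbook form. The symbols \(\alpha_k\), \(L_{2,\nu}\) and \(\nu\) are notation assumed here; the paper's own norms and constants may differ.

$$
x_{k+1} = x_k - \alpha_k \left[\nabla^2 f(x_k)\right]^{-1} \nabla f(x_k), \qquad \alpha_k \in (0, 1],
$$

$$
\left\| \nabla^2 f(x) - \nabla^2 f(y) \right\| \le L_{2,\nu} \, \|x - y\|^{\nu}, \qquad \nu \in [0, 1].
$$

Setting \(\alpha_k = 1\) recovers the classical Newton step, and \(\nu = 1\) recovers the usual Lipschitz continuous Hessian assumption.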