Abstract: The Newton, Gauss--Newton and Levenberg--Marquardt methods all use the first derivative of a vector function (the Jacobian) to minimise its sum of squares. When the Jacobian matrix is ill-conditioned, the function varies much faster in some directions than others and the space of possible improvement in sum of squares becomes a long narrow ellipsoid in the linear model. This means that even a small amount of nonlinearity in the problem parameters can cause a proposed point far down the long axis of the ellipsoid to fall outside of the actual curved valley of improved values, even though it is quite nearby. This paper presents a differential equation that `follows' these valleys, based on the technique of geodesic acceleration, which itself provides a 2$^\mathrm{nd}$ order improvement to the Levenberg--Marquardt iteration step. Higher derivatives of this equation are computed that allow $n^\mathrm{th}$ order improvements to the optimisation methods to be derived. These higher-order accelerated methods up to 4$^\mathrm{th}$ order are tested numerically and shown to provide substantial reduction of both number of steps and computation time.

What problem does this paper attempt to address?

The paper addresses the "narrow curved valley" problem in optimization. When the Jacobian matrix is ill-conditioned, the vector function being minimized changes much faster in some directions than in others, so the region of improvement predicted by the linear model becomes a long, narrow ellipsoid. Even a small amount of nonlinearity can then cause a proposed point far along the ellipsoid's major axis to fall outside the actual curved valley of improved values. To handle this, the paper presents a differential equation that "follows" these valleys. It is based on geodesic acceleration, which itself gives a second-order improvement to the Levenberg-Marquardt iteration step. Taking higher derivatives of this equation yields higher-order corrections to the optimization methods. Numerical tests of the accelerated methods up to fourth order show a substantial reduction in both the number of steps and the computation time. Specifically, the paper presents the following key points:

1. **Natural Optimization Path**: Defines a path implicitly by the equation $f(x(t)) = (1-t)f(x(0))$, where $t \in [0,1]$, so that all components of the error vector are scaled down uniformly.
2. **Higher-Order Derivatives**: Uses the Faà di Bruno formula to compute the second and higher-order derivatives of the natural path and derives the corresponding higher-order acceleration terms.
3. **Finite Difference Scheme**: Proposes a finite difference scheme for computing multi-directional derivatives, so that the higher-order correction terms can be implemented numerically.
4. **Numerical Tests**: Evaluates the higher-order algorithms on a simple test function for different values of the anisotropy factor $K$.
5. **Practical Application**: Applies the algorithms to a complex physical problem (ion focusing), where the higher-order methods significantly accelerate the optimization.
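The first two key points can be made concrete with a short derivation. This is a sketch under the assumption that the Jacobian $J = f'(x)$ is square and invertible; it is not copied from the paper, and the Levenberg-Marquardt damping term is left out. Differentiating the implicit equation $f(x(t)) = (1-t)f(x(0))$ once and twice with respect to $t$ and evaluating at $t = 0$ gives:

```latex
\begin{align}
  J\,x'(0) &= -f(x(0))
    &&\Rightarrow\quad x'(0) = -J^{-1} f(x(0)), \\
  J\,x''(0) + f''(x(0))\bigl[x'(0),\,x'(0)\bigr] &= 0
    &&\Rightarrow\quad x''(0) = -J^{-1} f''(x(0))\bigl[x'(0),\,x'(0)\bigr], \\
  x(1) &\approx x(0) + x'(0) + \tfrac{1}{2}\,x''(0).
\end{align}
```

The first line is the ordinary Newton step, and the $\tfrac{1}{2}x''(0)$ term in the Taylor expansion is the second-order (geodesic acceleration) correction. Here $f''(x)[v,v]$ denotes the second directional derivative of $f$ along $v$.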
In summary, the paper improves optimization methods like Levenberg-Marquardt by introducing higher-order correction terms, addressing the "narrow curved valley" problem in the optimization process, thereby enhancing optimization efficiency.
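To illustrate how a second-order correction helps in a narrow curved valley, here is a minimal Python sketch. The test function, the value of `K`, the step size `h` and all helper names are hypothetical choices made for illustration; they are not taken from the paper. The sketch assumes a square, invertible Jacobian and no damping, takes the path velocity from $J x' = -f$ and the acceleration from $J x'' = -f''[x', x']$, and estimates the second directional derivative with a central finite difference.

```python
# Hypothetical 2-D test problem (an assumption, not the paper's test function):
# the residuals vanish at (1, 1) and the valley follows the parabola
# x1 = x0**2. K is the anisotropy factor.
K = 1000.0

def f(x):
    """Residual vector of the hypothetical problem."""
    return [x[0] - 1.0, K * (x[1] - x[0] ** 2)]

def jacobian(x):
    """Analytic Jacobian of f."""
    return [[1.0, 0.0], [-2.0 * K * x[0], K]]

def solve2(A, b):
    """Solve the 2x2 linear system A y = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

def second_directional(x, v, h=1e-3):
    """Central finite-difference estimate of f''(x)[v, v]."""
    fp = f([x[0] + h * v[0], x[1] + h * v[1]])
    fm = f([x[0] - h * v[0], x[1] - h * v[1]])
    f0 = f(x)
    return [(fp[i] - 2.0 * f0[i] + fm[i]) / h ** 2 for i in range(2)]

def norm(r):
    """Euclidean norm of a 2-vector."""
    return (r[0] ** 2 + r[1] ** 2) ** 0.5

x = [0.0, 0.0]
J = jacobian(x)

# Velocity of the path: J x' = -f(x). This is the plain Newton step.
v = solve2(J, [-r for r in f(x)])

# Acceleration of the path: J x'' = -f''(x)[x', x'].
a = solve2(J, [-r for r in second_directional(x, v)])

# First-order step versus the step with the second-order correction.
x_first = [x[i] + v[i] for i in range(2)]
x_second = [x[i] + v[i] + 0.5 * a[i] for i in range(2)]

print("start       ", x, norm(f(x)))
print("first order ", x_first, norm(f(x_first)))
print("second order", x_second, norm(f(x_second)))
```

On this toy problem the plain Newton step moves from (0, 0) to (1, 0), which lies off the parabola, so the residual norm grows from 1 to `K`. The corrected step lands essentially on the solution (1, 1), because the natural path here is exactly $x(t) = (t, t^2)$ and a second-order Taylor expansion reproduces it. A real problem would not be exactly quadratic, so the correction would reduce the error rather than remove it.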

Higher-Order Corrections to Optimisers based on Newton's Method

Adapting Newton's Method to Neural Networks through a Summary of Higher-Order Derivatives

Higher-Order Newton Methods with Polynomial Work per Iteration

Yet another fast variant of Newton’s method for nonconvex optimization

Super-Universal Regularized Newton Method

Worst-case evaluation complexity and optimality of second-order methods for nonconvex smooth optimization

Fourth‐order variants of Newton's method without second derivatives for solving non‐linear equations

First and zeroth-order implementations of the regularized Newton method with lazy approximated Hessians

An Optimal Fourth Order Derivative-Free Numerical Algorithm for Multiple Roots

Online Learning Guided Quasi-Newton Methods with Global Non-Asymptotic Convergence

Second-order optimization with lazy Hessians

Second-order Neural Network Training Using Complex-step Directional Derivative

Inexact Newton-type Methods for Optimisation with Nonnegativity Constraints

Improved global performance guarantees of second-order methods in convex minimization

On the order optimality of the regularization via inexact Newton iterations

Quadratic Gradient: Combining Gradient Algorithms and Newton's Method as One

Using second-order information in gradient sampling methods for nonsmooth optimization

Scalable Subspace Methods for Derivative-Free Nonlinear Least-Squares Optimization

A structured modified Newton approach for solving systems of nonlinear equations arising in interior-point methods for quadratic programming

Nesterov's Acceleration For Approximate Newton

Coderivative-Based Newton Methods in Structured Nonconvex and Nonsmooth Optimization