A consistently adaptive trust-region method

Fadi Hamad, Oliver Hinder
2024-08-04
Abstract: Adaptive trust-region methods attempt to maintain strong convergence guarantees without depending on conservative estimates of problem properties such as Lipschitz constants. However, on close inspection, one can show that existing adaptive trust-region methods have theoretical guarantees with severely suboptimal dependence on problem properties such as the Lipschitz constant of the Hessian. For example, TRACE, developed by Curtis et al., obtains an $O(\Delta_f L^{3/2} \epsilon^{-3/2}) + \tilde{O}(1)$ iteration bound, where $L$ is the Lipschitz constant of the Hessian. Compared with the optimal $O(\Delta_f L^{1/2} \epsilon^{-3/2})$ bound, this is suboptimal with respect to $L$. We present the first adaptive trust-region method which circumvents this issue and requires at most $O(\Delta_f L^{1/2} \epsilon^{-3/2}) + \tilde{O}(1)$ iterations to find an $\epsilon$-approximate stationary point, matching the optimal iteration bound up to an additive logarithmic term. Our method is a simple variant of a classic trust-region method and in our experiments performs competitively with both ARC and a classical trust-region method.
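To make the gap between the two bounds concrete, the following minimal sketch evaluates only their leading terms, ignoring the hidden constants and the additive $\tilde{O}(1)$ term. The function names and the sample values of $\Delta_f$, $L$, and $\epsilon$ are assumptions chosen for illustration; they do not come from the paper.

```python
def trace_leading_term(delta_f, L, eps):
    """Leading term of the TRACE-style bound: delta_f * L^(3/2) * eps^(-3/2)."""
    return delta_f * L**1.5 * eps**-1.5


def optimal_leading_term(delta_f, L, eps):
    """Leading term of the optimal bound: delta_f * L^(1/2) * eps^(-3/2)."""
    return delta_f * L**0.5 * eps**-1.5


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the paper).
    delta_f, L, eps = 1.0, 100.0, 1e-2
    trace = trace_leading_term(delta_f, L, eps)
    optimal = optimal_leading_term(delta_f, L, eps)
    # The ratio of the two leading terms is L, independent of delta_f and eps.
    print(trace, optimal, trace / optimal)
```

Because the two expressions differ only in the power of $L$, their ratio is exactly $L$: the larger the Hessian Lipschitz constant, the more the suboptimal dependence costs.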
Optimization and Control
What problem does this paper attempt to address?
This paper addresses the lack of an adaptive trust-region method, for minimizing non-convex functions with Lipschitz continuous Hessians, whose iteration bound depends optimally on the Lipschitz constant $L$ of the Hessian. Existing adaptive trust-region methods avoid relying on conservative estimates of $L$, but their theoretical guarantees depend on it suboptimally. For example, TRACE has an iteration bound of $O(\Delta_f L^{3/2} \epsilon^{-3/2}) + \tilde{O}(1)$, whereas the optimal bound is $O(\Delta_f L^{1/2} \epsilon^{-3/2})$. The paper therefore proposes a new trust-region method that, without knowing $L$, finds an $\epsilon$-approximate stationary point in at most $O(\Delta_f L^{1/2} \epsilon^{-3/2}) + \tilde{O}(1)$ iterations, matching the optimal bound up to an additive logarithmic term. In experiments it performs competitively with both ARC and a classical trust-region method. Additionally, the method exhibits quadratic convergence in regions satisfying local optimality conditions.
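For readers unfamiliar with the baseline, here is a minimal sketch of a textbook trust-region loop of the kind the paper's method is described as a variant of. It is not the authors' algorithm: it uses the Cauchy point rather than an exact subproblem solve, and the acceptance threshold (0.1), the radius update factors (0.5 and 2.0), and all function names are assumptions chosen for illustration.

```python
import numpy as np


def cauchy_point(g, H, radius):
    """Minimize the quadratic model along -g, restricted to the trust region."""
    g_norm = np.linalg.norm(g)
    if g_norm == 0.0:
        return np.zeros_like(g)
    curvature = g @ H @ g
    tau_boundary = radius / g_norm  # step length that reaches the boundary
    if curvature <= 0.0:
        tau = tau_boundary  # model decreases all the way to the boundary
    else:
        tau = min(g_norm**2 / curvature, tau_boundary)
    return -tau * g


def classic_trust_region(f, grad, hess, x0, radius=1.0, eps=1e-6, max_iter=500):
    """Textbook trust-region loop; stops when the gradient norm is at most eps."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            break
        H = hess(x)
        d = cauchy_point(g, H, radius)
        predicted = -(g @ d + 0.5 * d @ H @ d)  # decrease promised by the model
        actual = f(x) - f(x + d)                # decrease actually achieved
        rho = actual / predicted if predicted > 0 else 0.0
        if rho >= 0.1:
            x = x + d  # accept the step
        if rho < 0.25:
            radius *= 0.5  # model was unreliable: shrink the region
        elif rho > 0.75 and np.isclose(np.linalg.norm(d), radius):
            radius *= 2.0  # model was reliable and the step hit the boundary
    return x


if __name__ == "__main__":
    # Small convex quadratic used purely as a smoke test (assumed example).
    A = np.diag([1.0, 2.0])
    b = np.array([1.0, 1.0])
    x_star = classic_trust_region(
        lambda x: 0.5 * x @ A @ x - b @ x,
        lambda x: A @ x - b,
        lambda x: A,
        np.zeros(2),
    )
    print(x_star)
```

The radius update is the adaptive ingredient here: the loop never needs the Hessian Lipschitz constant, only the ratio of actual to predicted decrease. How the paper modifies this update to obtain the optimal dependence on $L$ is not shown in this sketch.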