Advances in Nonmonotone Proximal Gradient Methods merely with Local Lipschitz Assumptions in the Presense of Kurdyka-Łojasiewicz Property: A Study of Average and Max Line Search

Xiaoxi Jia, Kai Wang
2024-11-29
Abstract: The proximal gradient method is a standard approach for solving composite minimization problems in which the objective function is the sum of a continuously differentiable function and a lower semicontinuous, extended-valued function. For both monotone and nonmonotone proximal gradient methods, the convergence theory has traditionally relied heavily on the assumption of global Lipschitz continuity. Recent works have shown that the monotone proximal gradient method, even when only local (rather than global) Lipschitz continuity is assumed, converges globally to a stationary point in the presence of the Kurdyka-Łojasiewicz (KL) property. However, how to extend these results from the monotone proximal gradient method to the nonmonotone proximal gradient method (NPG) remains an open question. In this manuscript, we consider two types of NPG: those combined with average line search and with max line search, respectively. By partitioning the indices into two subsets, one of which is designed to achieve a decrease in the sequence of function values, we establish global convergence and rate-of-convergence results (the same as for the monotone version) under the KL property, requiring merely the local Lipschitz assumption and no a priori knowledge that the iterative sequence is bounded. When our work was almost done, we noticed that [17] presented analogous results for the NPG with average line search, with a partitioning of the index set that is totally different from ours. Drawing upon the findings in this manuscript and [17], we confidently conclude that the convergence theory of NPG is independent of the specific partitioning of the index set.
Optimization and Control
### What problem does this paper attempt to solve?

This paper addresses how to guarantee global convergence, together with a convergence rate, for the nonmonotone proximal gradient (NPG) method when the smooth part of the objective is assumed to have only a locally Lipschitz continuous gradient. Specifically:

1. **Limitations of traditional methods**:
   - The convergence theories of traditional monotone and nonmonotone proximal gradient methods usually rely on global Lipschitz continuity of the gradient of the smooth part of the objective function. In practical applications this assumption is too strict and often cannot be satisfied.
   - Recent research has shown that, in the presence of the Kurdyka-Łojasiewicz (KL) property, the monotone proximal gradient method converges globally to a stationary point even under local Lipschitz continuity alone. For the nonmonotone proximal gradient method, this conclusion had not yet been proven.
2. **Research objectives**:
   - The paper extends these results from the monotone proximal gradient method to the nonmonotone proximal gradient method, specifically to the two types of NPG combined with the average line search and the max line search.
   - By dividing the index set into two subsets, one of which ensures a decrease of the sequence of function values, the authors establish global convergence and convergence rate results under local Lipschitz continuity alone.
3. **Main contributions**:
   - The paper proves that, under the KL property, the NPG method converges globally under local Lipschitz continuity alone, with the same convergence rate as the monotone version and without assuming a priori that the iterates are bounded.
   - Together with the results of [17], which uses a different partitioning for the average line search, the paper concludes that the convergence theory of the NPG method does not depend on the specific way the index set is divided.
### Formula presentation

The key formulas involved in the paper are as follows:

- **Objective function**:
  \[ q(x) := f(x) + g(x) \]
  where \( f \) is continuously differentiable and \( g \) is lower semicontinuous and extended-valued.
- **Acceptance criterion (average line search)**: the \( i \)-th trial point \( x_{k,i} \) at iteration \( k \) is accepted if
  \[ q(x_{k,i}) \leq \Phi_k - \delta \gamma_{k,i} \frac{\|x_{k,i} - x_k\|^2}{2} \]
- **Reference value (max line search)**:
  \[ q(x_{l(k)}) = \max_{j = 0, \ldots, \min\{m,k\}} q(x_{k-j}) \]
  The acceptance test has the same form as for the average line search, with \( \Phi_k \) replaced by \( q(x_{l(k)}) \).
- **Update rule for the average reference value**:
  \[ \Phi_{k+1} := (1 - p_k)\Phi_k + p_k q(x_{k+1}) \]
- **KL inequality** (for \( x \) near \( x^* \) with \( q(x) > q(x^*) \)):
  \[ \chi'(q(x) - q(x^*)) \geq \frac{1}{\text{dist}(0, \partial q(x))} \]
  where \( \chi \) is a desingularization function.

These formulas are the main ingredients of the paper's convergence analysis of the nonmonotone proximal gradient method under local Lipschitz continuity.
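
The acceptance tests and the update rule above can be sketched in code. The following is a minimal illustration, not the authors' algorithm as stated in the paper. It assumes that the trial point \( x_{k,i} \) is the proximal point of \( g/\gamma_{k,i} \) at \( x_k - \nabla f(x_k)/\gamma_{k,i} \), that \( \gamma_{k,i} \) is increased by a constant factor `tau` after each rejection, that \( p_k \) is a constant `p`, and that \( \Phi_0 = q(x_0) \). The function names, default parameters, stopping test, and the small \( \ell_1 \)-regularized least-squares demo are all hypothetical. The demo has a globally Lipschitz gradient, so it shows only the mechanics of the two line searches, not the local Lipschitz setting the paper targets.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal map of t * ||.||_1, the example choice of g used in the demo."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def npg(f, grad_f, g, prox_g, x0, rule="average", delta=0.1, tau=2.0,
        gamma_init=1.0, gamma_max=1e12, p=0.5, m=5, max_iter=2000, tol=1e-6):
    """Nonmonotone proximal gradient sketch (all defaults are assumptions).

    rule="average": reference value Phi_k, updated as a convex combination.
    rule="max":     reference value is the max of the last min(m, k) + 1
                    objective values.
    prox_g(v, t) must return the proximal point of t * g at v.
    """
    def q(z):
        return f(z) + g(z)

    x = np.asarray(x0, dtype=float)
    phi = q(x)        # assumed initialization Phi_0 := q(x_0)
    history = [phi]   # q(x_0), q(x_1), ... used by the max rule
    for k in range(max_iter):
        ref = phi if rule == "average" else max(history[-(m + 1):])
        grad = grad_f(x)
        gamma = gamma_init
        accepted = False
        while gamma <= gamma_max:
            # Trial point x_{k,i}: prox of g / gamma at a gradient step.
            x_trial = prox_g(x - grad / gamma, 1.0 / gamma)
            step_sq = float(np.sum((x_trial - x) ** 2))
            # Acceptance test: q(x_{k,i}) <= ref - delta*gamma*||step||^2 / 2.
            if q(x_trial) <= ref - delta * gamma * step_sq / 2.0:
                accepted = True
                break
            gamma *= tau
        if not accepted:
            break  # numerical safeguard only, not part of the method
        x = x_trial
        q_new = q(x)
        phi = (1.0 - p) * phi + p * q_new  # Phi_{k+1} update
        history.append(q_new)
        if gamma * np.sqrt(step_sq) <= tol:  # assumed stopping test
            break
    return x


# Hypothetical demo problem: f(x) = 0.5*||Ax - b||^2, g(x) = lam*||x||_1.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
lam = 0.1

f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad_f = lambda x: A.T @ (A @ x - b)
g = lambda x: lam * np.sum(np.abs(x))
prox_g = lambda v, t: soft_threshold(v, lam * t)

x0 = np.zeros(5)
x_avg = npg(f, grad_f, g, prox_g, x0, rule="average")
x_max = npg(f, grad_f, g, prox_g, x0, rule="max")
```

In this sketch the two rules differ only in the reference value `ref`; the trial-point computation and the sufficient-decrease term are shared, which mirrors the structure of the two acceptance criteria listed above.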