A nearly linearly convergent first-order method for nonsmooth functions with quadratic growth

Damek Davis, Liwei Jiang
2023-07-17
Abstract: Classical results show that gradient descent converges linearly to minimizers of smooth strongly convex functions. A natural question is whether there exists a locally nearly linearly convergent method for nonsmooth functions with quadratic growth. This work designs such a method for a wide class of nonsmooth and nonconvex locally Lipschitz functions, including max-of-smooth, Shapiro's decomposable class, and generic semialgebraic functions. The algorithm is parameter-free and derives from Goldstein's conceptual subgradient method.
Optimization and Control
What problem does this paper attempt to address?
### The problems the paper attempts to solve

The paper asks whether there exists a first-order optimization method that converges locally at a nearly linear rate on nonsmooth functions with quadratic growth. It proposes an algorithm named Normal Tangent Descent (NTDescent), which applies to a wide class of nonsmooth, nonconvex, locally Lipschitz functions, including max-of-smooth functions, Shapiro's decomposable class, and generic semialgebraic functions. NTDescent is parameter-free and derives from Goldstein's conceptual subgradient method.

### Main contributions

1. **Propose the NTDescent algorithm**: The algorithm achieves local nearly linear convergence on nonsmooth functions with typical structure, meaning quadratic growth together with a smooth substructure. Near the local minimizer, such a function has a smooth submanifold (the active manifold) along which it is smooth, while it grows sharply in directions normal to that manifold.
2. **Prove convergence**: The paper proves that NTDescent converges locally at a nearly linear rate on nonsmooth functions with typical structure, and that both the rate and the region of fast local convergence depend only on the function itself, not on the dimension of the problem.
3. **Analyze typical structures**: The paper analyzes in detail the function classes that have typical structure, including semialgebraic functions and decomposable functions, and shows that they have the required smooth substructure and quadratic growth near the local minimizer.
4. **Comparison with existing methods**: In the experiments, NTDescent outperforms the Polyak subgradient method (PolyakSGM), with a more pronounced advantage on high-dimensional problems.
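To make the "typical structure" in contribution 1 concrete, here is a small worked example. It is an illustration chosen for this summary, not one taken from the paper. Consider the max-of-smooth function

\[ f(x, y) = |x| + y^2 = \max\{x + y^2,\; -x + y^2\}, \qquad (x, y) \in \mathbb{R}^2, \]

with minimizer \(\bar{x} = (0, 0)\). The set \(\mathcal{M} = \{(0, y) : y \in \mathbb{R}\}\) plays the role of the active manifold: restricted to \(\mathcal{M}\), the function is \(y \mapsto y^2\), which is smooth, while in the normal direction it grows linearly, since \(f(x, y) - f(0, y) = |x|\). Quadratic growth holds near the minimizer because \(|x| \geq x^2\) whenever \(|x| \leq 1\), so

\[ f(x, y) - f(0, 0) = |x| + y^2 \geq x^2 + y^2 = \|(x, y)\|^2 \quad \text{for } |x| \leq 1. \]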
### Formulas and concepts

- **Goldstein subdifferential**: \[ \partial_\sigma f(x) := \operatorname{conv}\left(\bigcup_{y \in B_\sigma(x)} \partial f(y)\right) \quad \forall x \in \mathbb{R}^d,\ \sigma > 0 \]
- **Descent property**: with \(w\) the minimal-norm element of \(\partial_\sigma f(x)\), \[ f\left(x - \frac{\sigma w}{\|w\|}\right) \leq f(x) - \sigma \|w\| \quad \text{if } w \neq 0 \]
- **Gradient inequality**: with \(\bar{x}\) the local minimizer, \[ \sigma(x) \cdot \operatorname{dist}\left(0, \partial_{\sigma(x)} f(x)\right) \geq \eta \left(f(x) - f(\bar{x})\right) \]
- **NTDescent algorithm**: \[ x_{k+1} = x_k - \frac{\sigma_k w_k}{\|w_k\|} \quad \text{where } w_k = \text{MinNorm}(x_k, \sigma_k) \]

### Conclusion

With NTDescent, the paper gives an affirmative answer to the question of local nearly linear convergence for nonsmooth functions with quadratic growth. The algorithm comes with theoretical convergence guarantees and also performs well in practice, especially on high-dimensional problems.
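The Goldstein-style update listed under "Formulas and concepts" can be sketched in Python. This is a minimal illustration under stated assumptions, not the paper's NTDescent procedure: the minimal-norm element of the Goldstein subdifferential is approximated by sampling gradients in the ball \(B_\sigma(x)\) and running Frank-Wolfe over their convex hull, and the radius \(\sigma\) is halved by a simple heuristic whenever a step fails to decrease \(f\). The test function `f` and every helper name (`sample_ball`, `min_norm_in_hull`, `goldstein_step`, `run`) are hypothetical and chosen only for this example.

```python
import numpy as np


def f(x):
    # Illustrative max-of-smooth function |x0| + x1^2 (an assumption, not from the paper).
    return abs(x[0]) + x[1] ** 2


def grad_f(x):
    # One subgradient of f; np.sign(0) = 0 is a valid choice on the kink x0 = 0.
    return np.array([np.sign(x[0]), 2.0 * x[1]])


def sample_ball(x, sigma, n, rng):
    # Draw n points uniformly from the Euclidean ball of radius sigma around x.
    d = x.size
    u = rng.normal(size=(n, d))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    r = sigma * rng.random(n) ** (1.0 / d)
    return x + r[:, None] * u


def min_norm_in_hull(G, iters=200):
    # Frank-Wolfe with exact line search for the minimal-norm point
    # in the convex hull of the rows of G.
    m = G.shape[0]
    lam = np.full(m, 1.0 / m)
    for _ in range(iters):
        w = G.T @ lam
        j = np.argmin(G @ w)          # vertex minimizing the linearized objective
        d = G[j] - w
        dd = d @ d
        if dd < 1e-18:
            break
        step = np.clip(-(w @ d) / dd, 0.0, 1.0)
        lam *= 1.0 - step
        lam[j] += step
    return G.T @ lam


def goldstein_step(x, sigma, rng, n_samples=30):
    # Sampled stand-in for x - sigma * w / ||w|| with w = MinNorm(x, sigma).
    pts = np.vstack([x, sample_ball(x, sigma, n_samples, rng)])
    G = np.array([grad_f(p) for p in pts])
    w = min_norm_in_hull(G)
    nw = np.linalg.norm(w)
    if nw < 1e-12:
        return x.copy()
    return x - sigma * w / nw


def run(x0, sigma=0.1, steps=100, seed=0):
    # Heuristic driver: accept a step only if it decreases f, otherwise halve sigma.
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x_new = goldstein_step(x, sigma, rng)
        if f(x_new) < f(x):
            x = x_new
        else:
            sigma *= 0.5
    return x
```

Calling `run([1.0, 1.0])` returns a point with a smaller value of `f` than the starting point. Because the hull of sampled gradients is only a subset of the true Goldstein subdifferential, the descent guarantee stated above need not hold for this sampled variant, which is why the driver checks for decrease explicitly.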