Abstract: Classical results show that gradient descent converges linearly to minimizers of smooth strongly convex functions. A natural question is whether there exists a locally nearly linearly convergent method for nonsmooth functions with quadratic growth. This work designs such a method for a wide class of nonsmooth and nonconvex locally Lipschitz functions, including max-of-smooth, Shapiro's decomposable class, and generic semialgebraic functions. The algorithm is parameter-free and derives from Goldstein's conceptual subgradient method.
What problem does this paper attempt to address?
### The problems the paper attempts to solve
This paper asks whether there is a first-order optimization method that converges locally at a nearly linear rate on nonsmooth functions with quadratic growth. It proposes an algorithm named Normal Tangent Descent (NTDescent), which applies to a wide class of nonsmooth, nonconvex, locally Lipschitz functions, including max-of-smooth functions, Shapiro's decomposable class, and generic semialgebraic functions. NTDescent is parameter-free, so it requires no tuning, and it derives from Goldstein's conceptual subgradient method.
### Main contributions
1. **Propose the NTDescent algorithm**: The algorithm achieves local nearly linear convergence on nonsmooth functions with typical structure, i.e., quadratic growth together with a smooth substructure. Typical structure means that near a local minimizer there is a smooth submanifold (the active manifold) along which the function is smooth, while the function grows rapidly in directions normal to the manifold.
2. **Prove convergence**: The paper proves that NTDescent converges locally at a nearly linear rate on nonsmooth functions with typical structure. Both the convergence rate and the region of fast local convergence depend only on the function itself, not on the dimension of the problem.
3. **Analyze typical structures**: The paper analyzes in detail the function classes with typical structure, including semialgebraic functions and decomposable loss functions, and shows that these functions have the required smooth substructure and quadratic growth near a local minimizer.
4. **Comparison with existing methods**: In the reported experiments, NTDescent outperforms the Polyak subgradient method (PolyakSGM), and the advantage is more pronounced on high-dimensional problems.
### Formulas and concepts
- **Goldstein subdifferential**:
\[
\partial_\sigma f(x) := \text{conv}\left(\bigcup_{y \in B_\sigma(x)} \partial f(y)\right) \quad \forall x \in \mathbb{R}^d, \sigma > 0
\]
- **Descent property**:
\[
f\left(x - \frac{\sigma w}{\|w\|}\right) \leq f(x) - \sigma \|w\| \quad \text{if } w \neq 0, \text{ where } w := \operatorname*{argmin}_{g \in \partial_\sigma f(x)} \|g\|
\]
- **Gradient inequality**:
\[
\sigma(x) \cdot \text{dist}(0, \partial_{\sigma(x)} f(x)) \geq \eta (f(x) - f(\bar{x}))
\]
- **NTDescent algorithm**:
\[
x_{k + 1}=x_k - \frac{\sigma_k w_k}{\|w_k\|} \quad \text{where } w_k = \text{MinNorm}(x_k, \sigma_k)
\]
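
The formulas above can be tried out on the one-dimensional function f(x) = |x|, where the Goldstein subdifferential is an interval and its minimal-norm element is available in closed form. The following is a minimal sketch under stated assumptions, not the paper's NTDescent: the helper names (`goldstein_interval`, `min_norm`, `run`) are hypothetical, the ball `B_sigma(x)` is taken to be closed, the exact minimal-norm element stands in for `MinNorm`, and the rule of halving `sigma` whenever the minimal-norm element is zero is an illustrative choice only.

```python
def f(x: float) -> float:
    """Toy nonsmooth objective with minimizer 0."""
    return abs(x)


def goldstein_interval(x: float, sigma: float) -> tuple[float, float]:
    """Goldstein subdifferential of |t| at x, returned as an interval [lo, hi].

    Assumes a closed ball [x - sigma, x + sigma]. The subdifferential of |t|
    is {-1} for t < 0, [-1, 1] at t = 0, and {+1} for t > 0, so the convex
    hull over the ball is determined by whether the ball reaches t <= 0
    and whether it reaches t >= 0.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    lo = -1.0 if x - sigma <= 0 else 1.0
    hi = 1.0 if x + sigma >= 0 else -1.0
    return lo, hi


def min_norm(x: float, sigma: float) -> float:
    """Exact minimal-norm element of the interval above (stand-in for MinNorm)."""
    lo, hi = goldstein_interval(x, sigma)
    if lo <= 0.0 <= hi:
        return 0.0
    return lo if lo > 0 else hi


def run(x0: float, sigma0: float, num_steps: int) -> list[float]:
    """Iterate x <- x - sigma * w / |w| with w = min_norm(x, sigma).

    When w == 0 the step is undefined, so sigma is halved instead
    (an assumption made for this sketch, not the paper's schedule).
    """
    x, sigma = x0, sigma0
    history = [x]
    for _ in range(num_steps):
        w = min_norm(x, sigma)
        if w == 0.0:
            sigma *= 0.5
        else:
            x = x - sigma * w / abs(w)
        history.append(x)
    return history


if __name__ == "__main__":
    x, sigma = 2.0, 0.5
    w = min_norm(x, sigma)
    # Descent property: f(x - sigma * w / |w|) <= f(x) - sigma * |w|
    print(f(x - sigma * w / abs(w)), "<=", f(x) - sigma * abs(w))
    print(run(1.0, 0.4, 20)[-1])
```

In this toy setting the descent inequality holds with equality whenever `w` is nonzero, and the iterates move toward the minimizer while `sigma` shrinks. For general functions the minimal-norm element has no closed form and has to be approximated.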
### Conclusion
The paper answers its motivating question affirmatively: NTDescent converges locally at a nearly linear rate on nonsmooth functions with quadratic growth. Beyond its theoretical guarantees, the algorithm performs well in the reported experiments, particularly on high-dimensional problems.