Guillaume Olikier, Kyle A. Gallivan, P.-A. Absil
Abstract: We consider the problem of minimizing a differentiable function with locally Lipschitz continuous gradient on a stratified set and present a first-order algorithm designed to find a stationary point of that problem. Our assumptions on the stratified set are satisfied notably by the determinantal variety (i.e., matrices of bounded rank), its intersection with the cone of positive-semidefinite matrices, and the set of nonnegative sparse vectors. The iteration map of the proposed algorithm applies a step of projected-projected gradient descent with backtracking line search, as proposed by Schneider and Uschmajew (2015), to its input but also to a projection of the input onto each of the lower strata to which it is considered close, and outputs a point among those thereby produced that maximally reduces the cost function. Under our assumptions on the stratified set, we prove that this algorithm produces a sequence whose accumulation points are stationary, and therefore does not follow the so-called apocalypses described by Levin, Kileel, and Boumal (2022). We illustrate the apocalypse-free property of our method through a numerical experiment on the determinantal variety.
What problem does this paper attempt to address?
The paper addresses the problem of minimizing a differentiable function with a locally Lipschitz continuous gradient on a stratified set. Specifically, it seeks stationary points of this problem on stratified sets such as the determinantal variety (matrices of bounded rank), its intersection with the cone of positive-semidefinite matrices, and the set of nonnegative sparse vectors. The paper proposes a first-order algorithm (P2GDR) designed to avoid the "apocalypse" phenomenon that existing methods may encounter, in which the generated sequence accumulates at a non-stationary point. It proves that, under the paper's assumptions on the stratified set, all accumulation points of the sequence generated by P2GDR are stationary. A numerical experiment on the determinantal variety illustrates this apocalypse-free property.
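The iteration map described in the abstract can be sketched for the determinantal variety (matrices of rank at most `r`). This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the threshold `delta` that decides which lower strata count as "close", the numerical-rank tolerance `tol`, and the backtracking constants are all hypothetical choices. The tangent-cone projection combines the tangent-space part with a truncated SVD of the normal part, which is the standard formula for this variety.

```python
import numpy as np

def truncate(X, k):
    """Project X onto the matrices of rank <= k via a truncated SVD."""
    U, sv, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * sv[:k]) @ Vt[:k, :]

def project_tangent_cone(X, D, r, tol=1e-12):
    """A projection of D onto the tangent cone to {rank <= r} at X."""
    m, n = X.shape
    U, sv, Vt = np.linalg.svd(X)
    s = int(np.sum(sv > tol))                 # numerical rank of X (tol is an assumption)
    PU = U[:, :s] @ U[:, :s].T                # projector onto the column space of X
    PV = Vt[:s, :].T @ Vt[:s, :]              # projector onto the row space of X
    tangent = PU @ D + D @ PV - PU @ D @ PV   # tangent-space component
    normal = (np.eye(m) - PU) @ D @ (np.eye(n) - PV)
    return tangent + truncate(normal, r - s)  # add best rank-(r - s) part of the normal component

def p2gd_step(f, grad, X, r, alpha0=1.0, beta=0.5, c=1e-4, max_backtracks=50):
    """One projected-projected gradient step with Armijo backtracking."""
    G = project_tangent_cone(X, -grad(X), r)
    g2 = float(np.sum(G * G))
    alpha = alpha0
    for _ in range(max_backtracks):
        Y = truncate(X + alpha * G, r)
        if f(Y) <= f(X) - c * alpha * g2:
            return Y
        alpha *= beta
    return X                                  # fallback if no step size was accepted

def p2gdr_step(f, grad, X, r, delta=1e-3, tol=1e-12):
    """Apply p2gd_step to X and to its projections onto nearby lower-rank strata."""
    sv = np.linalg.svd(X, compute_uv=False)
    rank = int(np.sum(sv > tol))
    rank_delta = int(np.sum(sv > delta))      # rank after discarding singular values <= delta
    candidates = [p2gd_step(f, grad, truncate(X, k), r)
                  for k in range(rank_delta, rank + 1)]
    return min(candidates, key=f)             # keep the candidate with the lowest cost

# Toy usage: a rank-1 target, starting close to the zero matrix.
A = np.diag([0.0, 1.0])
f = lambda X: 0.5 * np.sum((X - A) ** 2)
grad = lambda X: X - A
X0 = np.diag([1e-4, 0.0])
print(f(p2gd_step(f, grad, X0, r=1)), f(p2gdr_step(f, grad, X0, r=1)))
```

In this toy run the plain step only sees the tangent space at `X0` and cannot move toward the target, whereas the variant that also starts from the rank-0 projection of `X0` can. Trying every close lower stratum costs one extra line search per stratum.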
### Key problems
1. **Problem definition**:
- Let \( E \) be a Euclidean vector space with inner product \( \langle \cdot, \cdot \rangle \) and induced norm \( \|\cdot\| \).
- Consider a differentiable function \( f: E \to \mathbb{R} \) whose gradient is locally Lipschitz continuous.
- Consider a nonempty closed subset \( C \subseteq E \), assumed in the paper to be a stratified set.
- The goal is to minimize \( f \) on \( C \), that is, to solve the problem:
\[
\min_{x \in C} f(x) \tag{1}
\]
2. **Definition of stationary points**:
- A point \( x \in C \) is called a stationary point of problem (1) if it satisfies one of the following equivalent conditions:
1. \( \langle \nabla f(x), v \rangle \geq 0 \) for all \( v \in T_C(x) \), where \( T_C(x) \) denotes the tangent cone to \( C \) at \( x \).
2. \( -\nabla f(x) \in \hat{N}_C(x) \), where \( \hat{N}_C(x) \) denotes the regular normal cone to \( C \) at \( x \).
3. \( s(x; f, C)=0 \), where \( s(\cdot; f, C) \) is the stationarity measure, defined as:
\[
s(x; f, C)=\| P_{T_C(x)}(-\nabla f(x)) \| \tag{2}
\]
3. **Limitations of existing methods**:
- Existing first-order methods, such as the projected-projected gradient descent (P2GD) of Schneider and Uschmajew (2015), may generate sequences that converge to non-stationary points, namely when they follow an "apocalypse" in the sense of Levin, Kileel, and Boumal (2022).
- A point \( x \in C \) is called apocalyptic if there exist a sequence \( (x_i)_{i \in \mathbb{N}} \) in \( C \) converging to \( x \) and a continuously differentiable function \( \phi: E \to \mathbb{R} \) such that \( s(x_i; \phi, C) \to 0 \) but \( s(x; \phi, C)>0 \).
4. **Contributions of the paper**:
- Proposes a new algorithm, P2GDR, which finds stationary points on stratified sets and does not follow apocalypses.
- Proves that, under the paper's assumptions on the stratified set, all accumulation points of the sequence generated by P2GDR are stationary.
- Illustrates the apocalypse-free property of P2GDR through a numerical experiment on the determinantal variety.
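To make the stationarity measure (2) and the notion of an apocalyptic point concrete, the sketch below evaluates \( s(x; \phi, C) \) on the determinantal variety of \( 2 \times 2 \) matrices of rank at most 1. The example is our own illustration, not one taken from the paper: the linear function \( \phi(X) = -X_{22} \), the sequence \( X_i = \mathrm{diag}(1/i, 0) \), the function name, and the numerical-rank tolerance `tol` are all assumptions. The norm of the tangent-cone projection is computed from the tangent-space component plus the leading singular values of the normal component.

```python
import numpy as np

def stationarity_measure(X, G, r, tol=1e-12):
    """Norm of the projection of -G onto the tangent cone to {rank <= r} at X.

    G is the gradient of the cost function at X; tol sets the numerical rank.
    """
    m, n = X.shape
    U, sv, Vt = np.linalg.svd(X)
    s = int(np.sum(sv > tol))                 # numerical rank of X
    PU = U[:, :s] @ U[:, :s].T                # projector onto the column space of X
    PV = Vt[:s, :].T @ Vt[:s, :]              # projector onto the row space of X
    D = -G
    tangent = PU @ D + D @ PV - PU @ D @ PV   # tangent-space component
    normal = (np.eye(m) - PU) @ D @ (np.eye(n) - PV)
    sig = np.linalg.svd(normal, compute_uv=False)
    # The two components are orthogonal, so their squared norms add up;
    # only the r - s leading singular values of the normal part are kept.
    return float(np.sqrt(np.sum(tangent ** 2) + np.sum(sig[: r - s] ** 2)))

# phi(X) = -X[1, 1] has the constant gradient below.
G = np.array([[0.0, 0.0], [0.0, -1.0]])
for i in (1, 10, 100, 1000):
    print(i, stationarity_measure(np.diag([1.0 / i, 0.0]), G, r=1))
print("limit", stationarity_measure(np.zeros((2, 2)), G, r=1))
```

Along the rank-1 sequence the tangent cone is a linear space that excludes the \( (2,2) \) direction, so the measure vanishes there, while at the rank-0 limit the tangent cone is the whole variety and the measure stays positive. Under the definition above, this makes the zero matrix an apocalyptic point of this variety.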
### Conclusion
The paper proposes P2GDR, a first-order algorithm for optimization problems on stratified sets. Under the paper's assumptions on the stratified set, every accumulation point of the generated sequence is stationary, so the algorithm does not follow the apocalypses that existing methods may encounter. This convergence result is established theoretically and illustrated by a numerical experiment on the determinantal variety.