Abstract: With the growing interest in and applications of machine learning and data science, finding an efficient method for sparse analysis of high-dimensional data and optimizing a dimension reduction model to extract lower-dimensional features has become increasingly important. Orthogonality constraints (the Stiefel manifold) are commonly encountered in these applications, and sparsity is usually enforced through the element-wise L1 norm. Many applications of optimization over the Stiefel manifold can be found in physics and machine learning. In this paper, we propose a novel approach that tackles the Stiefel manifold through a nonlinear eigen-approach: we first use ADMM to split the problem into a smooth optimization over the manifold and a convex non-smooth optimization, then transform the former into a nonlinear eigenvalue problem with eigenvector dependency (NEPv), which is solved by self-consistent field (SCF) iteration, while the latter admits a closed-form solution through the proximal gradient method. Compared with existing methods, our proposed algorithm takes advantage of the specific structure of the objective function and has efficient convergence results under mild assumptions.

What problem does this paper attempt to address?

This paper addresses sparse optimization on the Stiefel manifold (i.e., under orthogonality constraints), particularly for dimensionality reduction models applied to high-dimensional data. Specifically, it proposes a new method for non-smooth composite minimization problems with orthogonality constraints. The goal is an effective method for sparse analysis of high-dimensional data and extraction of low-dimensional features.

### Problem Description

The main optimization problem discussed in the paper is:

\[ \min_{X} f(X)+r(X) \quad \text{s.t.} \quad X \in S_{n,p} \]

where:

- \( S_{n,p}=\{X \in \mathbb{R}^{n \times p} \mid X^{\top}X = I_p\} \) is the Stiefel manifold.
- \( f:\mathbb{R}^{n \times p}\to\mathbb{R} \) is a differentiable but possibly non-convex function whose gradient \( \nabla f(X) \) is Lipschitz continuous.
- \( r:\mathbb{R}^{n \times p}\to\mathbb{R} \) is a convex but possibly non-smooth function, typically the element-wise \( \ell_1 \) norm, which enforces sparsity.

### Limitations of Existing Methods

Existing methods fall mainly into two categories:

1. **Riemannian algorithms**: such as the Riemannian sub-gradient method and MADMM (Manifold Alternating Direction Method of Multipliers). These methods are mainly designed for smooth objective functions and are less effective for non-smooth ones.
2. **Lagrange multiplier methods**: such as PAMAL (Proximal Alternating Minimized Augmented Lagrangian). By introducing an indicator function, the minimization problem with orthogonality constraints is transformed into an unconstrained minimization problem. However, these methods are sensitive to parameter settings and have high computational complexity.

### Proposed New Method

The paper proposes a new method, called NEPv ADMM, based on the nonlinear eigenvalue problem with eigenvector dependency (NEPv) and the alternating direction method of multipliers (ADMM). The main steps of this method are as follows:

1. **Variable splitting**: Decompose the original problem into two sub-problems: a smooth optimization problem on the manifold and a simple convex optimization problem.
2. **ADMM framework**: Solve the two sub-problems alternately within the ADMM framework.
3. **NEPv transformation**: Transform the smooth optimization problem on the manifold into a nonlinear eigenvalue problem and solve it using self-consistent field (SCF) iteration.
4. **Proximal gradient method**: For the convex sub-problem, use the proximal gradient method, which yields a closed-form solution.

### Main Contributions

1. **Efficiency**: The new method exploits the specific structure of the objective function and converges efficiently.
2. **Applicability**: It applies to a range of problems in machine learning and data science, such as sparse principal component analysis (sPCA), orthogonal dictionary learning (ODL), and compressed modes in physics.
3. **Innovation**: For the first time, NEPv and ADMM are combined and applied to non-smooth optimization problems with orthogonality constraints.

### Conclusion

The NEPv ADMM method proposed in the paper shows advantages in handling non-smooth optimization problems with orthogonality constraints, especially with respect to sparsity and dimensionality reduction. Future work will further study the convergence properties of the method and extend it to a wider range of objective functions.
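The variable-splitting and ADMM steps described above can be illustrated with a minimal sketch for a sparse PCA instance, assuming \( f(X) = -\operatorname{tr}(X^{\top} A X) \) and \( r(Z) = \mu \lVert Z \rVert_1 \) with the split \( X = Z \). The function names, the parameters `mu`, `rho`, and `step`, and the choice of test problem are all hypothetical. Note in particular that the X-update below is a linearized step solved by a polar factorization, used here only as a simple stand-in; it is not the NEPv/SCF solver the paper proposes. The Z-update is the standard closed-form soft-thresholding step for the \( \ell_1 \) norm.

```python
import numpy as np

def soft_threshold(V, tau):
    """Closed-form prox of tau * ||.||_1: shrink each entry toward zero by tau."""
    return np.sign(V) * np.maximum(np.abs(V) - tau, 0.0)

def polar_factor(M):
    """Matrix with orthonormal columns maximizing <X, M>, via a thin SVD."""
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt

def admm_sparse_pca(A, p, mu=0.1, rho=1.0, step=None, iters=200, seed=0):
    """Illustrative ADMM for min -tr(X^T A X) + mu*||Z||_1  s.t. X = Z, X^T X = I.

    Assumption: the X-update is a linearized (Procrustes-type) step,
    NOT the NEPv/SCF solver described in the paper.
    """
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    X = polar_factor(rng.standard_normal((n, p)))  # feasible starting point
    Z = X.copy()
    Lam = np.zeros((n, p))                         # Lagrange multiplier for X = Z
    if step is None:
        step = 1.0 / (2.0 * np.linalg.norm(A, 2))  # 1 / Lipschitz constant of grad f
    for _ in range(iters):
        grad = -2.0 * A @ X                        # gradient of f for symmetric A
        # X-update: minimize the linearized augmented Lagrangian plus a proximal
        # term over the Stiefel manifold. Since ||X||_F^2 = p is constant there,
        # this reduces to maximizing <X, M>, solved by the polar factor of M.
        M = X / step + rho * Z - Lam - grad
        X = polar_factor(M)
        # Z-update: prox of (mu/rho) * ||.||_1 evaluated at X + Lam/rho.
        Z = soft_threshold(X + Lam / rho, mu / rho)
        # Dual update on the splitting constraint X = Z.
        Lam = Lam + rho * (X - Z)
    return X, Z
```

A call such as `X, Z = admm_sparse_pca(B.T @ B, p=2)` for a data matrix `B` returns an orthonormal `X` and a sparse `Z`; how closely the two agree depends on `mu`, `rho`, and the iteration count, and this sketch makes no convergence claim.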

Nonlinear Eigen-approach ADMM for Sparse Optimization on Stiefel Manifold

Convergence of Projected Subgradient Method with Sparse or Low-Rank Constraints

A Theory of the NEPv Approach for Optimization On the Stiefel Manifold

An Alternating Manifold Proximal Gradient Method for Sparse Principal Component Analysis and Sparse Canonical Correlation Analysis

Manifold Quadratic Penalty Alternating Minimization for Sparse Principal Component Analysis

A Riemannian ADMM

ADMM for Nonsmooth Composite Optimization under Orthogonality Constraints

Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold

ManiDec: Manifold Constrained Low-Rank and Sparse Decomposition

Decentralized Weakly Convex Optimization Over the Stiefel Manifold

An Accelerated Stochastic ADMM for Nonconvex and Nonsmooth Finite-Sum Optimization

Linearized Alternating Direction Method with Adaptive Penalty for Low-Rank Representation

An Extension of Fast Iterative Shrinkage‐thresholding Algorithm to Riemannian Optimization for Sparse Principal Component Analysis

An Extension of Fast Iterative Shrinkage-thresholding to Riemannian Optimization for Sparse Principal Component Analysis

A Riemannian optimization method on the indefinite Stiefel manifold

A Penalty-Free Infeasible Approach for a Class of Nonsmooth Optimization Problems Over the Stiefel Manifold

Global Optimization with Orthogonality Constraints Via Stochastic Diffusion on Manifold

Riemannian Stochastic Proximal Gradient Methods for Nonsmooth Optimization over the Stiefel Manifold

A Riemannian Alternating Descent Ascent Algorithmic Framework for Nonconvex-Linear Minimax Problems on Riemannian Manifolds