Abstract: We consider a class of structured, nonconvex, nonsmooth optimization problems under orthogonality constraints, where the objectives combine a smooth function, a nonsmooth concave function, and a nonsmooth weakly convex function. This class of problems finds diverse applications in statistical learning and data science. Existing ADMMs for addressing these problems often fail to exploit the specific structure of orthogonality constraints, struggle with nonsmooth functions and nonconvex constraint sets, or result in suboptimal oracle complexity. We propose {\sf OADMM}, an Alternating Direction Method of Multipliers (ADMM) designed to solve this class of problems using efficient proximal linearized strategies. Two specific variants of {\sf OADMM} are explored: one based on Euclidean Projection ({\sf OADMM-EP}) and the other on Riemannian retraction ({\sf OADMM-RR}). We integrate a Nesterov extrapolation strategy into {\sf OADMM-EP} and a monotone Barzilai-Borwein strategy into {\sf OADMM-RR} to potentially accelerate primal convergence. Additionally, we adopt an over-relaxation strategy in both {\sf OADMM-EP} and {\sf OADMM-RR} for rapid dual convergence. Under mild assumptions, we prove that {\sf OADMM} converges to a critical point of the problem with a provable convergence rate of $\mathcal{O}(1/\epsilon^{3})$. We also establish the convergence rate of {\sf OADMM} under the Kurdyka-Łojasiewicz (KL) inequality. Numerical experiments are conducted to demonstrate the advantages of the proposed method.

What problem does this paper attempt to address?

This paper addresses a class of nonconvex, nonsmooth composite optimization problems under orthogonality constraints. The objective combines a smooth function, a nonsmooth concave function, and a nonsmooth weakly convex function. Specifically, the paper focuses on problems of the form

\[ \min_{X \in \mathbb{R}^{n \times r}} F(X) \triangleq f(X) - g(X) + h(A(X)), \quad \text{s.t.} \quad X^T X = I_r \]

where:

- $n \geq r$
- $A(X) \in \mathbb{R}^{m \times 1}$
- $I_r$ is the $r \times r$ identity matrix
- $X^T X = I_r$ is the orthogonality constraint

### Problem Background

This class of problems has a wide range of applications in statistical learning and data science, including sparse principal component analysis (PCA), deep neural networks, orthogonal nonnegative matrix factorization, range-based independent component analysis, and dictionary learning.

### Shortcomings of Existing Methods

Existing alternating direction methods of multipliers (ADMMs) have the following shortcomings on these problems:

1. **Failure to exploit the structure of the orthogonality constraint**: Many methods do not take advantage of the specific structure of the constraint, which makes them inefficient.
2. **Difficulty with nonsmooth functions and nonconvex constraint sets**: Existing methods often perform poorly when the objective is nonsmooth and the constraint set is nonconvex.
3. **Suboptimal oracle complexity**: Some methods come with oracle complexity bounds that are not the best achievable, which translates into slow convergence.

### Contributions of the Paper

To overcome these shortcomings, the paper proposes OADMM, an ADMM for problems under orthogonality constraints, in two variants:

1. **OADMM-EP**: A method based on Euclidean projection.
2. **OADMM-RR**: A method based on Riemannian retraction.
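The two variants differ in how an iterate is kept on the constraint set $X^T X = I_r$. The sketch below illustrates the two generic building blocks, assuming NumPy and standard textbook choices: an SVD-based Euclidean projection and a QR-based retraction. The function names are hypothetical, and the paper may use a different retraction; this is not the authors' implementation of OADMM, only an illustration of the two operations the variant names refer to.

```python
import numpy as np


def project_stiefel(Y):
    """Euclidean projection of Y (n x r, n >= r, full column rank) onto
    {X : X^T X = I_r}, computed from the thin SVD Y = U S V^T as U V^T."""
    U, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ Vt


def tangent_project(X, G):
    """Project an arbitrary matrix G onto the tangent space at X,
    i.e. onto {V : X^T V + V^T X = 0}, via G - X sym(X^T G)."""
    XtG = X.T @ G
    return G - X @ ((XtG + XtG.T) / 2.0)


def retract_qr(X, V):
    """QR-based retraction (an assumed, common choice): map X + V back to
    the constraint set by taking the Q factor, with column signs fixed so
    that the diagonal of R is nonnegative."""
    Q, R = np.linalg.qr(X + V)
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs


# Usage: both maps return a matrix with orthonormal columns.
rng = np.random.default_rng(0)
n, r = 8, 3
X0 = project_stiefel(rng.standard_normal((n, r)))        # feasible start
step = -0.1 * tangent_project(X0, rng.standard_normal((n, r)))
X_ep = project_stiefel(X0 + step)                         # projection-style update
X_rr = retract_qr(X0, step)                               # retraction-style update
```

The projection accepts any full-rank matrix, so a projection-style update can take an unconstrained step (for example, an extrapolated point) and project afterwards. The retraction is applied to a tangent direction at the current feasible point, which is why the step is first passed through `tangent_project`.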
### Main Features

- **Acceleration strategy**: A Nesterov extrapolation strategy is integrated into OADMM-EP, and a monotone Barzilai-Borwein step-size strategy into OADMM-RR, to potentially accelerate convergence of the primal variables.
- **Fast dual convergence**: Convergence of the dual variables is accelerated through an over-relaxation strategy in both variants.
- **Convergence proof**: Under mild assumptions, OADMM is proved to converge to a critical point of the problem with an oracle complexity of $O(1/\epsilon^3)$, and a convergence rate is established under the Kurdyka-Łojasiewicz inequality.

### Applications and Experiments

The paper evaluates OADMM-EP and OADMM-RR through numerical experiments on the sparse PCA problem. The results show that the two methods generally attain better objective function values than other state-of-the-art optimization algorithms, such as RADMM, SPGM-EP, SPGM-RR, and Sub-Grad.

### Conclusion

The paper proposes OADMM, an alternating direction method of multipliers designed specifically for nonsmooth composite optimization problems under orthogonality constraints, and supports it with both theoretical analysis and numerical experiments.

ADMM for Nonsmooth Composite Optimization under Orthogonality Constraints

ADMM for Nonconvex Optimization under Minimal Continuity Assumption

A Riemannian ADMM

A Stochastic Alternating Direction Method of Multipliers for Non-smooth and Non-convex Optimization

A Unified Inexact Stochastic ADMM for Composite Nonconvex and Nonsmooth Optimization

Linearized ADMM for Nonconvex Nonsmooth Optimization with Convergence Analysis

An Empirical Study of ADMM for Nonconvex Problems

Zeroth-Order Online Alternating Direction Method of Multipliers: Convergence Analysis and Applications

Accelerating ADMM for Efficient Simulation and Optimization

ADMM for Structured Fractional Minimization

An Inertial Proximal Alternating Direction Method of Multipliers for Nonconvex Optimization

Global Convergence of ADMM in Nonconvex Nonsmooth Optimization

Parallelized ADMM with General Objectives for Deep Learning

Convergence of ADMM for Optimization Problems with Nonseparable Nonconvex Objective and Linear Constraints

An inertial ADMM for a class of nonconvex composite optimization with nonlinear coupling constraints

A Bregman-Style Improved ADMM and its Linearized Version in the Nonconvex Setting: Convergence and Rate Analyses

An Adaptive Proximal ADMM for Nonconvex Linearly-Constrained Composite Programs

Understanding the ADMM Algorithm via High-Resolution Differential Equations

A partial Bregman ADMM with a general relaxation factor for structured nonconvex and nonsmooth optimization

Accelerated Alternating Direction Method of Multipliers: an Optimal O(1/K) Nonergodic Analysis

Adaptive Relaxed ADMM: Convergence Theory and Practical Implementation