Abstract: We propose a stochastic gradient descent ascent (GDA) method with backtracking (SGDA-B) to solve nonconvex-(strongly) concave (NCC) minimax problems $\min_x \max_y \sum_{i=1}^N g_i(x_i)+f(x,y)-h(y)$, where $h$ and $g_i$ for $i = 1, \ldots, N$ are closed, convex functions, and $f$ is $L$-smooth and $\mu$-strongly concave in $y$ for some $\mu\geq 0$. We consider two scenarios: (i) the deterministic setting, where we assume one can compute $\nabla f$ exactly, and (ii) the stochastic setting, where we only have access to $\nabla f$ through an unbiased stochastic oracle with finite variance. While most existing methods assume knowledge of the Lipschitz constant $L$, SGDA-B is agnostic to $L$. Moreover, SGDA-B supports random block-coordinate updates. In the deterministic setting, SGDA-B can compute an $\epsilon$-stationary point within $\mathcal{O}(L\kappa^2/\epsilon^2)$ and $\mathcal{O}(L^3/\epsilon^4)$ gradient calls when $\mu>0$ and $\mu=0$, respectively, where $\kappa=L/\mu$. In the stochastic setting, for any $p \in (0, 1)$ and $\epsilon >0$, it can compute an $\epsilon$-stationary point with probability at least $1-p$ using $\mathcal{O}(L\kappa^3\epsilon^{-4}\log(1/p))$ and $\tilde{\mathcal{O}}(L^4\epsilon^{-7}\log(1/p))$ stochastic oracle calls when $\mu>0$ and $\mu=0$, respectively. To our knowledge, SGDA-B is the first GDA-type method with backtracking for NCC minimax problems, and it achieves the best complexity among methods that are agnostic to $L$. We also provide numerical results for SGDA-B on a distributionally robust learning problem, illustrating the potential performance gains that SGDA-B can achieve.
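To give a feel for what "agnostic to $L$" can mean in practice, the following is a minimal sketch of deterministic GDA with a doubling estimate of $L$ on a toy scalar problem. It is not the authors' SGDA-B: the toy objective, the fixed primal/dual step ratio `ratio`, the slack constant, and the acceptance test (a standard descent-lemma check on each block) are all assumptions chosen for illustration, and the nonsmooth terms $g_i$ and $h$ are omitted.

```python
import math

# Hypothetical toy objective: smooth, nonconvex in x, 1-strongly concave in y.
def f(x, y):
    return x * y - 0.5 * y * y + 0.1 * math.cos(x)

def grad_x(x, y):
    return y - 0.1 * math.sin(x)

def grad_y(x, y):
    return x - y

def gda_backtracking(x, y, L_est=0.01, ratio=0.1, tol=1e-6, max_iter=10000):
    """Simultaneous GDA whose step sizes come from a running estimate of L.

    Assumption: a trial step is accepted when both blocks pass a
    descent-lemma inequality with the current L_est; otherwise L_est is
    doubled. This is a generic stand-in test, not the paper's condition.
    """
    slack = 1e-12  # small tolerance so rounding error does not trigger doubling
    for _ in range(max_iter):
        gx, gy = grad_x(x, y), grad_y(x, y)
        if math.hypot(gx, gy) <= tol:
            break
        while True:
            sigma = 1.0 / L_est   # dual (ascent) step size
            tau = ratio * sigma   # primal (descent) step size, kept smaller
            x_new = x - tau * gx
            y_new = y + sigma * gy
            dx, dy = x_new - x, y_new - y
            # Upper quadratic bound in x, lower quadratic bound in y.
            ok_x = f(x_new, y) <= f(x, y) + gx * dx + 0.5 * L_est * dx * dx + slack
            ok_y = f(x, y_new) >= f(x, y) + gy * dy - 0.5 * L_est * dy * dy - slack
            if ok_x and ok_y:
                break
            L_est *= 2.0          # estimate too small: back off and retry
        x, y = x_new, y_new
    return x, y, L_est

x_star, y_star, L_final = gda_backtracking(2.0, -1.0)
print(x_star, y_star, L_final)
```

The only stationary point of this toy problem is $(0, 0)$, so the returned iterate should be close to the origin. The estimate `L_est` only ever increases, which keeps the number of rejected trial steps logarithmic in the ratio between the true constant and the initial guess.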

Solving Log-Determinant Optimization Problems by a Newton-CG Primal Proximal Point Algorithm

A Proximal Point Algorithm For Log-Determinant Optimization With Group Lasso Regularization

A Semismooth Newton-CG Based Dual PPA for Matrix Spectral Norm Approximation Problems

A Primal Majorized Semismooth Newton-CG Augmented Lagrangian Method for Large-Scale Linearly Constrained Convex Programming

A Riemannian Proximal Newton-CG Method

A globally convergent difference-of-convex algorithmic framework and application to log-determinant optimization problems

An adaptive proximal point algorithm framework and application to large-scale optimization

Global Optimization for Non-Convex Programs via Convex Proximal Point Method

Generalized Newton Method with Positive Definite Regularization for Nonsmooth Optimization Problems with Nonisolated Solutions

Subspace Newton method for sparse group optimization problem

A Sparse Semismooth Newton Based Proximal Majorization-Minimization Algorithm for Nonconvex Square-Root-loss Regression Problems

On the Differentiability of the Primal-Dual Interior-Point Method

Primal Dual Alternating Proximal Gradient Algorithms for Nonsmooth Nonconvex Minimax Problems with Coupled Linear Constraints

A Stochastic GDA Method With Backtracking For Solving Nonconvex (Strongly) Concave Minimax Problems

GRPDA Revisited: Relaxed Condition and Connection to Chambolle-Pock's Primal-Dual Algorithm

Fast stochastic second-order method logarithmic in condition number

An Alternating Manifold Proximal Gradient Method for Sparse Principal Component Analysis and Sparse Canonical Correlation Analysis

A Proximal-Proximal Majorization-Minimization Algorithm for Nonconvex Rank Regression Problems

A Proximal-Gradient Method for Constrained Optimization

A Newton-CG Augmented Lagrangian Method for Semidefinite Programming

A primal-dual interior-point relaxation method with global and rapidly local convergence for nonlinear programs