Abstract: Nonconvex-nonconcave minimax optimization has received intense attention over the last decade due to its broad applications in machine learning. Most existing algorithms rely on one-sided information, such as the convexity (resp. concavity) of the primal (resp. dual) functions, or on other specific structures, such as the Polyak-Łojasiewicz (PŁ) and Kurdyka-Łojasiewicz (KŁ) conditions. However, verifying these regularity conditions is challenging in practice. To meet this challenge, we propose a novel, universally applicable single-loop algorithm, the doubly smoothed gradient descent ascent method (DS-GDA), which naturally balances the primal and dual updates. That is, DS-GDA with the same hyperparameters uniformly solves nonconvex-concave, convex-nonconcave, and nonconvex-nonconcave problems with one-sided KŁ properties, achieving convergence with $\mathcal{O}(\epsilon^{-4})$ complexity. Sharper (even optimal) iteration complexity can be obtained when the KŁ exponent is known. Specifically, under the one-sided KŁ condition with exponent $\theta\in(0,1)$, DS-GDA converges with an iteration complexity of $\mathcal{O}(\epsilon^{-2\max\{2\theta,1\}})$. These rates match the best corresponding results in the literature. Moreover, we show that DS-GDA is practically applicable to general nonconvex-nonconcave problems even without any regularity conditions, such as the PŁ condition, the KŁ condition, or the weak Minty variational inequality condition. On various challenging nonconvex-nonconcave examples from the literature, including "Forsaken", "Bilinearly-coupled minimax", "Sixth-order polynomial", and "PolarGame", DS-GDA gets rid of the limit cycles in every case. To the best of our knowledge, DS-GDA is the first first-order algorithm to achieve convergence on all of these formidable problems.
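The abstract does not spell out the DS-GDA updates, so the following is only a minimal sketch of what a doubly smoothed single-loop scheme could look like, under stated assumptions. It assumes the smoothed surrogate $F(x,y,z,v) = f(x,y) + \frac{r_1}{2}\|x-z\|^2 - \frac{r_2}{2}\|y-v\|^2$, with auxiliary variables $z$ and $v$ that track $x$ and $y$ by exponential averaging. The function name `ds_gda`, the parameter names, the default step sizes, and the toy objective $f(x,y) = x^\top y$ are all illustrative choices, not taken from the paper.

```python
import numpy as np

def ds_gda(grad_x, grad_y, x0, y0, r1=1.0, r2=1.0, c=0.1, alpha=0.1,
           beta=0.05, mu=0.05, iters=2000,
           proj_x=lambda x: x, proj_y=lambda y: y):
    """Sketch of a doubly smoothed gradient descent ascent loop.

    Assumed surrogate (not stated in the abstract):
        F(x, y, z, v) = f(x, y) + r1/2 * ||x - z||^2 - r2/2 * ||y - v||^2
    grad_x, grad_y return the partial gradients of f at (x, y).
    proj_x, proj_y project onto the feasible sets (identity by default).
    All default hyperparameters are illustrative, not tuned values.
    """
    x = np.asarray(x0, dtype=float)
    y = np.asarray(y0, dtype=float)
    z, v = x.copy(), y.copy()  # smoothing anchors start at the iterates
    for _ in range(iters):
        # Primal descent step on F (gradient of f plus proximal pull toward z).
        x = proj_x(x - c * (grad_x(x, y) + r1 * (x - z)))
        # Dual ascent step on F, using the freshly updated x.
        y = proj_y(y + alpha * (grad_y(x, y) - r2 * (y - v)))
        # Exponential averaging of the two anchors.
        z = z + beta * (x - z)
        v = v + mu * (y - v)
    return x, y

# Toy bilinear problem f(x, y) = x . y, whose unique saddle point is the origin.
x_star, y_star = ds_gda(lambda x, y: y, lambda x, y: x, [1.0], [1.0])
print(np.linalg.norm(x_star), np.linalg.norm(y_star))
```

Setting `r1 = r2 = 0` in this sketch removes both smoothing terms and leaves plain alternating gradient descent ascent, which cycles around the origin on the bilinear toy problem instead of converging to it. That gives a small-scale picture of the limit-cycle behavior the abstract refers to.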
