Abstract: In this paper, we consider the decentralized, stochastic nonconvex strongly-concave (NCSC) minimax problem with nonsmooth regularization terms on both primal and dual variables, wherein a network of $m$ computing agents collaborate via peer-to-peer communications. We consider the cases where the coupling function is in expectation or finite-sum form and the two regularizers are convex functions, applied separately to the primal and dual variables. Our algorithmic framework introduces a Lagrangian multiplier to eliminate the consensus constraint on the dual variable. Coupling this with variance-reduction (VR) techniques, our proposed method, entitled VRLM, requires only a single neighbor communication per iteration and achieves an $\mathcal{O}(\kappa^3\varepsilon^{-3})$ sample complexity under the general stochastic setting, with either a big-batch or small-batch VR option, where $\kappa$ is the condition number of the problem and $\varepsilon$ is the desired solution accuracy. With a big-batch VR, we can additionally achieve $\mathcal{O}(\kappa^2\varepsilon^{-2})$ communication complexity. Under the special finite-sum setting, our method with a big-batch VR can achieve an $\mathcal{O}(n + \sqrt{n} \kappa^2\varepsilon^{-2})$ sample complexity and $\mathcal{O}(\kappa^2\varepsilon^{-2})$ communication complexity, where $n$ is the number of components in the finite sum. All complexity results match the best-known results achieved by a few existing methods for solving special cases of the problem we consider. To the best of our knowledge, this is the first work that provides convergence guarantees for NCSC minimax problems with general convex nonsmooth regularizers applied to both the primal and dual variables in the decentralized stochastic setting. Numerical experiments are conducted on two machine learning problems.
Our code is available at https://github.com/RPI-OPT/VRLM.
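For orientation, the problem class described above can be sketched as follows. The symbols $x$, $y$, $f_i$, $F_i$, $F_{i,j}$, $\xi_i$, $g$, and $h$ are illustrative assumptions introduced here, not notation taken from the paper:

$$
\min_{x} \max_{y} \; \frac{1}{m}\sum_{i=1}^{m} f_i(x, y) + g(x) - h(y),
\qquad
f_i(x, y) =
\begin{cases}
\mathbb{E}_{\xi_i}\big[F_i(x, y; \xi_i)\big] & \text{(general stochastic setting)}, \\
\dfrac{1}{n}\sum_{j=1}^{n} F_{i,j}(x, y) & \text{(finite-sum setting)}.
\end{cases}
$$

In this sketch, $f_i$ is the local coupling function held by agent $i$, assumed nonconvex in the primal variable $x$ and strongly concave in the dual variable $y$, while $g$ and $h$ stand for the convex, possibly nonsmooth regularizers on the primal and dual variables respectively. In the decentralized setting each agent keeps local copies of the variables, and agreement among the copies is what the consensus constraint mentioned above refers to.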

Variance-reduced accelerated methods for decentralized stochastic double-regularized nonconvex strongly-concave minimax problems
