Abstract: In this paper, we consider the decentralized, stochastic nonconvex strongly-concave (NCSC) minimax problem with nonsmooth regularization terms on both primal and dual variables, wherein a network of $m$ computing agents collaborate via peer-to-peer communications. We consider the cases where the coupling function is in expectation or finite-sum form and the two regularizers are convex functions, applied separately to the primal and dual variables. Our algorithmic framework introduces a Lagrangian multiplier to eliminate the consensus constraint on the dual variable. Coupling this with variance-reduction (VR) techniques, our proposed method, called VRLM, requires only a single neighbor communication per iteration and achieves an $\mathcal{O}(\kappa^3\varepsilon^{-3})$ sample complexity under the general stochastic setting, with either a big-batch or small-batch VR option, where $\kappa$ is the condition number of the problem and $\varepsilon$ is the desired solution accuracy. With a big-batch VR, we additionally achieve $\mathcal{O}(\kappa^2\varepsilon^{-2})$ communication complexity. Under the special finite-sum setting, our method with a big-batch VR achieves an $\mathcal{O}(n + \sqrt{n} \kappa^2\varepsilon^{-2})$ sample complexity and $\mathcal{O}(\kappa^2\varepsilon^{-2})$ communication complexity, where $n$ is the number of components in the finite sum. All complexity results match the best-known results achieved by a few existing methods for solving special cases of the problem we consider. To the best of our knowledge, this is the first work that provides convergence guarantees for NCSC minimax problems with general convex nonsmooth regularizers applied to both the primal and dual variables in the decentralized stochastic setting. Numerical experiments are conducted on two machine learning problems.
Our code is available at https://github.com/RPI-OPT/VRLM.
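
For orientation, the problem class described above can be sketched in formulas. The notation here is assumed for illustration and is not taken from the paper: $f_i$ denotes the local coupling function of agent $i$, $g$ and $h$ the convex regularizers on the primal and dual variables, $\xi_i$ a random sample, $f_{ij}$ a finite-sum component, $A$ any linear operator whose null space encodes consensus of the stacked dual copies $\mathbf{y} = (y_1, \dots, y_m)$, and $\lambda$ the Lagrangian multiplier. The last line is one plausible reading of how a multiplier can remove the dual consensus constraint under standard duality conditions; the paper's exact reformulation, scaling, and sign convention may differ.

```latex
% Problem class from the abstract (notation assumed for illustration):
\min_{x} \max_{y} \; \frac{1}{m} \sum_{i=1}^{m} f_i(x, y) + g(x) - h(y),
\qquad
f_i(x, y) = \mathbb{E}_{\xi_i}\!\left[ F_i(x, y; \xi_i) \right]
\;\;\text{or}\;\;
f_i(x, y) = \frac{1}{n} \sum_{j=1}^{n} f_{ij}(x, y).

% Decentralized form: each agent i keeps local copies (x_i, y_i),
% tied together by consensus constraints.
\min_{x_1 = \dots = x_m} \; \max_{y_1 = \dots = y_m} \;
\frac{1}{m} \sum_{i=1}^{m} \bigl[ f_i(x_i, y_i) + g(x_i) - h(y_i) \bigr].

% Dualizing the dual-variable consensus constraint, written as A \mathbf{y} = 0
% (hypothetical operator A; the multiplier \lambda joins the minimization):
\min_{x_1 = \dots = x_m,\; \lambda} \; \max_{\mathbf{y}} \;
\frac{1}{m} \sum_{i=1}^{m} \bigl[ f_i(x_i, y_i) + g(x_i) - h(y_i) \bigr]
- \langle \lambda, A \mathbf{y} \rangle.
```

In this reading, the inner maximization no longer carries a consensus constraint, so it separates across agents once $\lambda$ is fixed, which is consistent with the abstract's claim of a single neighbor communication per iteration.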

Variance-reduced accelerated methods for decentralized stochastic double-regularized nonconvex strongly-concave minimax problems

Finite-time Distributed ConvexOptimization with Zero-Gradient-Sum Algorithms

Stochastic Nested Variance Reduction for Nonconvex Optimization

Linear Convergence of First- and Zeroth-Order Primal-Dual Algorithms for Distributed Nonconvex Optimization

Privacy-Preserved Distributed Learning With Zeroth-Order Optimization

On the Divergence of Decentralized Non-Convex Optimization