Abstract: In this paper, we consider the decentralized stochastic nonconvex strongly-concave (NCSC) minimax problem with nonsmooth regularization terms on both the primal and dual variables, in which a network of $m$ computing agents collaborates via peer-to-peer communication. We consider the cases where the coupling function is in expectation or finite-sum form and the two regularizers are convex functions applied separately to the primal and dual variables. Our algorithmic framework introduces a Lagrangian multiplier to eliminate the consensus constraint on the dual variable. Coupling this with variance-reduction (VR) techniques, our proposed method, called VRLM, needs only a single neighbor communication per iteration and achieves an $\mathcal{O}(\kappa^3\varepsilon^{-3})$ sample complexity under the general stochastic setting, with either a big-batch or small-batch VR option, where $\kappa$ is the condition number of the problem and $\varepsilon$ is the desired solution accuracy. With big-batch VR, it additionally achieves an $\mathcal{O}(\kappa^2\varepsilon^{-2})$ communication complexity. Under the special finite-sum setting, our method with big-batch VR achieves an $\mathcal{O}(n + \sqrt{n} \kappa^2\varepsilon^{-2})$ sample complexity and an $\mathcal{O}(\kappa^2\varepsilon^{-2})$ communication complexity, where $n$ is the number of components in the finite sum. All complexity results match the best-known results achieved by a few existing methods for solving special cases of the problem we consider. To the best of our knowledge, this is the first work that provides convergence guarantees for NCSC minimax problems with general convex nonsmooth regularizers applied to both the primal and dual variables in the decentralized stochastic setting. Numerical experiments are conducted on two machine learning problems.
Our code is downloadable from https://github.com/RPI-OPT/VRLM.
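
For orientation, the problem class described in the abstract can plausibly be written as follows. This is only a sketch under assumed notation: the symbols $x$, $y$, $f_i$, $F_i$, $F_{ij}$, $g$, $h$, $\xi_i$, $L$, and $\mu$ are illustrative choices that do not appear in the abstract, and the paper's own formulation may differ, for example in how the $n$ finite-sum components are indexed across agents.

$$
\min_{x} \max_{y} \; \frac{1}{m} \sum_{i=1}^{m} f_i(x, y) + g(x) - h(y),
\qquad
f_i(x, y) = \mathbb{E}_{\xi_i}\big[ F_i(x, y; \xi_i) \big]
\quad \text{or} \quad
f_i(x, y) = \frac{1}{n} \sum_{j=1}^{n} F_{ij}(x, y).
$$

Here agent $i$ holds the local coupling function $f_i$, assumed smooth, nonconvex in the primal variable $x$, and strongly concave in the dual variable $y$, while $g$ and $h$ stand for the convex, possibly nonsmooth regularizers on $x$ and $y$. Under the common convention that $f_i$ has an $L$-Lipschitz gradient and is $\mu$-strongly concave in $y$, the condition number would be $\kappa = L / \mu$.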

Variance-Reduced Accelerated First-Order Methods: Central Limit Theorems and Confidence Statements

Accelerated Stochastic ADMM with Variance Reduction

Variance-reduced first-order methods for deterministically constrained stochastic nonconvex optimization with strong convergence guarantees

Stochastic Sub-Sampled Newton Method with Variance Reduction

Variance-reduced accelerated methods for decentralized stochastic double-regularized nonconvex strongly-concave minimax problems

Variance Reduction via Accelerated Dual Averaging for Finite-Sum Optimization

Universality of AdaGrad Stepsizes for Stochastic Optimization: Inexact Oracle, Acceleration and Variance Reduction

Convergence of Distributed Stochastic Variance Reduced Methods Without Sampling Extra Data

First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities

Stochastic Variance-Reduced Newton: Accelerating Finite-Sum Minimization with Large Batches

Convergence Analysis of Accelerated Stochastic Gradient Descent under the Growth Condition

Accelerated First-Order Optimization Algorithms for Machine Learning

Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm

Variational Analysis Perspective on Linear Convergence of Some First Order Methods for Nonsmooth Convex Optimization Problems

Linear Convergence of Variance-Reduced Stochastic Gradient without Strong Convexity

Zeroth-order Gradient and Quasi-Newton Methods for Nonsmooth Nonconvex Stochastic Optimization

Gradient tracking and variance reduction for decentralized optimization and machine learning

Shuffling Gradient Descent-Ascent with Variance Reduction for Nonconvex-Strongly Concave Smooth Minimax Problems

Statistical Inference for Polyak-Ruppert Averaged Zeroth-order Stochastic Gradient Algorithm

Accelerated First-Order Optimization under Nonlinear Constraints