Abstract: Recently, many stochastic variance reduced alternating direction methods of multipliers (ADMMs) (e.g., SAG-ADMM and SVRG-ADMM) have made exciting progress, such as linear convergence rates for strongly convex (SC) problems. However, their best-known convergence rate for non-strongly convex (non-SC) problems is $\mathcal{O}(1/T)$, as opposed to the $\mathcal{O}(1/T^2)$ rate of accelerated deterministic algorithms, where $T$ is the number of iterations. Thus, there remains a gap between the convergence rates of existing stochastic ADMMs and those of deterministic algorithms. To bridge this gap, we introduce a new momentum acceleration trick into stochastic variance reduced ADMM and propose a novel accelerated SVRG-ADMM method (called ASVRG-ADMM) for machine learning problems with the constraint $Ax + By = c$. We then design a linearized proximal update rule and a simple proximal one for the two classes of ADMM-style problems with $B = \tau I$ and $B \ne \tau I$, respectively, where $I$ is an identity matrix and $\tau$ is an arbitrary bounded constant. Note that our linearized proximal update rule avoids solving sub-problems iteratively. Moreover, we prove that ASVRG-ADMM converges linearly for SC problems. In particular, ASVRG-ADMM improves the convergence rate from $\mathcal{O}(1/T)$ to $\mathcal{O}(1/T^2)$ for non-SC problems. Finally, we apply ASVRG-ADMM to various machine learning problems, e.g., graph-guided fused Lasso, graph-guided logistic regression, graph-guided SVM, generalized graph-guided fused Lasso, and multi-task learning, and show that ASVRG-ADMM consistently converges faster than the state-of-the-art methods.
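The abstract builds on SVRG-style variance reduction. As a rough illustration only, the sketch below shows the standard SVRG gradient estimator on a toy least-squares loss. It is not the ASVRG-ADMM method itself: the momentum step, the ADMM primal and dual updates, and the constraint $Ax + By = c$ are all omitted. The loss, the data, and every function name here (`grad_i`, `full_grad`, `svrg_grad`) are assumptions made for the example.

```python
import random

def grad_i(x, a, b):
    """Gradient of the assumed toy loss f_i(x) = 0.5 * (a.x - b)^2."""
    r = sum(aj * xj for aj, xj in zip(a, x)) - b
    return [r * aj for aj in a]

def full_grad(x, A, bs):
    """Gradient of f(x) = (1/n) * sum_i f_i(x)."""
    n = len(A)
    total = [0.0] * len(x)
    for a, b in zip(A, bs):
        g = grad_i(x, a, b)
        total = [t + gj for t, gj in zip(total, g)]
    return [t / n for t in total]

def svrg_grad(x, x_snap, mu_snap, a, b):
    """SVRG estimator: grad f_i(x) - grad f_i(x_snap) + grad f(x_snap)."""
    gx = grad_i(x, a, b)
    gs = grad_i(x_snap, a, b)
    return [p - q + m for p, q, m in zip(gx, gs, mu_snap)]

# Hypothetical toy data: n samples in d dimensions.
rng = random.Random(0)
n, d = 8, 3
A = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
bs = [rng.gauss(0, 1) for _ in range(n)]

x_snap = [0.0] * d                    # snapshot point
mu_snap = full_grad(x_snap, A, bs)    # full gradient stored at the snapshot
x = [0.5, -0.25, 1.0]                 # current iterate

# Averaging the estimator over all samples recovers the full gradient at x,
# i.e. the estimator is unbiased under uniform sampling of i.
estimates = [svrg_grad(x, x_snap, mu_snap, a, b) for a, b in zip(A, bs)]
mean_est = [sum(col) / n for col in zip(*estimates)]
exact = full_grad(x, A, bs)
max_gap = max(abs(p - q) for p, q in zip(mean_est, exact))
print(max_gap)  # a value at floating-point rounding level
```

When the iterate equals the snapshot, the two per-sample gradients cancel and the estimator returns the stored full gradient for every sample, which is the sense in which the variance shrinks as the iterates approach the snapshot.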

Accelerated Doubly Stochastic Gradient Algorithm for Large-scale Empirical Risk Minimization

Stochastic Momentum Method with Double Acceleration for Regularized Empirical Risk Minimization

Accelerated Stochastic ADMM with Variance Reduction

Accelerated Stochastic ADMM for Empirical Risk Minimization

Scalable Stochastic Alternating Direction Method of Multipliers

Optimal Adaptive and Accelerated Stochastic Gradient Descent

Multi-stage stochastic gradient method with momentum acceleration

An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification

Combining Conjugate Gradient and Momentum for Unconstrained Stochastic Optimization With Applications to Machine Learning

Stagewise Accelerated Stochastic Gradient Methods for Nonconvex Optimization

Accelerated mini-batch stochastic dual coordinate ascent

Demystifying SGD with Doubly Stochastic Gradients

Convergence Analysis of an Accelerated Stochastic ADMM with Larger Stepsizes

Efficient Stochastic Gradient Hard Thresholding

Accelerated Variance Reduction Stochastic ADMM for Large-Scale Machine Learning

Accelerated Almost-Sure Convergence Rates for Nonconvex Stochastic Gradient Descent using Stochastic Learning Rates

Stochastic Primal-Dual Method for Empirical Risk Minimization with O(1) Per-Iteration Complexity

Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise

A Stochastic Alternating Direction Method of Multipliers for Non-smooth and Non-convex Optimization

Asynchronous Accelerated Stochastic Gradient Descent

Convergence on a Symmetric Accelerated Stochastic ADMM with Larger Stepsizes