On the Convergence of Decentralized Gradient Descent.
Kun Yuan,Qing Ling,Wotao Yin
DOI: https://doi.org/10.1137/130943170
IF: 2.763
2016-01-01
SIAM Journal on Optimization
Abstract:Consider the consensus problem of minimizing f(x) = Sigma(n)(i=1) fi(x), where x is an element of R-p and each f(i) is only known to the individual agent i in a connected network of n agents. To solve this problem and obtain the solution, all the agents collaborate with their neighbors through information exchange. This type of decentralized computation does not need a fusion center, offers better network load balance, and improves data privacy. This paper studies the decentralized gradient descent method [A. Nedic and A. Ozdaglar, IEEE Trans. Automat. Control, 54 (2009), pp. 48-61], in which each agent i updates its local variable x((i)) is an element of R-n by combining the average of its neighbors' with a local negative-gradient step -alpha del f(i)(x((i))). The method is described by the iteration x((i)) (k + 1) <- Sigma(n)(j=1) w(ij)x((j))(k)-alpha del f(i)(x((i))(k)), for each agent i, where w(ij) is nonzero only if i and j are neighbors or i = j and the matrix W = [w(ij)] is an element of R-n (x) (n) is symmetric and doubly stochastic. This paper analyzes the convergence of this iteration and derives its rate of convergence under the assumption that each f(i) is proper closed convex and lower bounded, del f(i) is Lipschitz continuous with constant L-fi > 0, and the stepsize a is fixed. Provided that alpha <= min{(1 + lambda(n)) (W))/L-h, 1/L-(f) over bar}, where L-h = max(i){L-fi} and L-(f) over bar = (1)(n) Sigma(n)(i=1) L-fi, the objective errors of all the local solutions and the networkwide mean solution reduce at rates of O(1/k) until they reach a level of O(alpha). If fi are strongly convex with modulus mu(fi), and alpha <= min{(1+lambda(n)(W))/L-h, 1/(L-f+mu(f))}, where mu(f) - 1/n Sigma(n)(i=1) mu(fi), then all the local solutions and the mean solution converge to the global minimizer x* at a linear rate until reaching an O(alpha)-neighborhood of x*. We also develop an iteration for decentralized basis pursuit and establish its linear convergence to an O(alpha)-neighborhood of the true sparse signal. This analysis reveals how the convergence of x((i))(k + 1) <- Sigma(n)(j=1) wijx(j)(k) - alpha del f(i)(x((i))(k)), for each agent i, depends on the stepsize, function convexity, and network spectrum.