Adaptive Consensus: A network pruning approach for decentralized optimization

Suhail M. Shah, Albert S. Berahas, Raghu Bollapragada
2023-09-06
Abstract: We consider network-based decentralized optimization problems, where each node in the network possesses a local function and the objective is to collectively attain a consensus solution that minimizes the sum of all the local functions. A major challenge in decentralized optimization is the reliance on communication, which remains a considerable bottleneck in many applications. To address this challenge, we propose an adaptive randomized communication-efficient algorithmic framework that reduces the volume of communication by periodically tracking the disagreement error and judiciously selecting the most influential and effective edges at each node for communication. Within this framework, we present two algorithms: Adaptive Consensus (AC) to solve the consensus problem and Adaptive Consensus based Gradient Tracking (AC-GT) to solve smooth strongly convex decentralized optimization problems. We establish strong theoretical convergence guarantees for the proposed algorithms and quantify their performance in terms of various algorithmic parameters under standard assumptions. Finally, numerical experiments showcase the effectiveness of the framework in significantly reducing the information exchange required to achieve a consensus solution.
Optimization and Control, Machine Learning
What problem does this paper attempt to address?
This paper addresses how to reduce the communication requirements of decentralized optimization while preserving the convergence properties of the underlying algorithm. Specifically, the authors focus on network-based decentralized optimization problems, where each node has a local function and the goal is to collectively reach a consensus solution that minimizes the sum of all local functions. Communication is a major bottleneck in this setting because information exchange between nodes consumes a large amount of resources. To address this challenge, the authors propose an adaptive, randomized, communication-efficient algorithmic framework that reduces the volume of communication by periodically tracking the disagreement error and selecting the most effective and influential edges at each node. Within this framework, they present two algorithms: Adaptive Consensus (AC) for the consensus problem, and Adaptive Consensus based Gradient Tracking (AC-GT) for smooth and strongly convex decentralized optimization problems.

### Main contributions of the paper

1. **Proposed an adaptive and communication-efficient algorithmic framework**:
   - Two new algorithms, AC and AC-GT, are introduced within this framework.
   - These algorithms reduce the amount of communication by exploiting the network structure, specifically by selecting the most effective and influential edges at each node for communication.
   - The framework is broadly applicable: it can be combined with other existing decentralized optimization algorithms and adapted to other settings, such as directed graphs, time-varying topologies, and asynchronous updates.
2. **Provided theoretical convergence guarantees**:
   - For smooth and strongly convex problems, the AC and AC-GT algorithms retain the linear convergence properties of their underlying algorithms while reducing the communication requirements.
   - The analysis uses the theory of non-homogeneous matrix products and proves that the product of the pruned matrices is still contractive.
   - Unlike existing analyses, the rate constants in the results are obtained through coefficients of ergodicity, which highlights the dependence of the convergence rate on the network pruning parameters.
3. **Demonstrated the empirical performance of the algorithms**:
   - Numerical experiments demonstrate the effectiveness of AC on the standard consensus problem and of AC-GT on linear regression and binary classification (logistic regression) problems.
   - The experimental results show that the proposed algorithms significantly reduce the communication overhead while maintaining the quality of the solution.

### Background of the paper

- **Decentralized optimization**: Decentralized optimization problems arise in fields such as wireless sensor networks, power system design, parallel computing, and robotics.
- **Communication bottleneck**: In many applications, communication requirements are the main performance bottleneck for decentralized optimization methods.
- **Existing methods**: A variety of communication-efficient algorithms already exist, but most lack rigorous convergence guarantees or require additional assumptions.

### Method overview

- **Network model**: The network is modeled by an undirected graph \(G = \{V, E\}\), where \(V\) is the set of nodes and \(E\) is the set of edges.
- **Pruning protocol**: The edges to be pruned are selected by an adaptive randomized procedure, which reduces the amount of communication.
- **Adaptive consensus algorithm**: The pruning protocol is executed at the beginning of each consensus cycle, and the pruned weights are then used for decentralized averaging.
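The prune-then-average cycle described above can be illustrated with a small sketch. It is not the authors' AC algorithm: it assumes Metropolis mixing weights, scalar node values, and a deterministic "keep the `k` largest-disagreement neighbors" rule in place of the paper's randomized selection. The names `metropolis_weights`, `prune`, `adaptive_consensus`, `k`, `cycles`, and `steps_per_cycle` are all hypothetical.

```python
import numpy as np

def metropolis_weights(adj):
    """Symmetric, doubly stochastic mixing matrix (assumed weighting scheme)."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adj[i, j]:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()  # off-diagonals of row i are already set
    return W

def prune(W, adj, x, k):
    """Illustrative pruning rule (an assumption, not the paper's protocol):
    each node keeps its k neighbors with the largest disagreement |x_i - x_j|,
    and an edge survives if either endpoint keeps it."""
    n = len(x)
    keep = np.zeros((n, n), dtype=bool)
    for i in range(n):
        nbrs = np.flatnonzero(adj[i])
        gaps = np.abs(x[i] - x[nbrs])
        keep[i, nbrs[np.argsort(-gaps)[:k]]] = True
    keep = keep | keep.T                      # keep the edge set symmetric
    Wp = np.where(keep, W, 0.0)               # drop pruned edges (diagonal is 0 here)
    np.fill_diagonal(Wp, 1.0 - Wp.sum(axis=1))  # pruned weight moves to the self-loop
    return Wp

def adaptive_consensus(adj, x0, k=1, cycles=20, steps_per_cycle=5):
    """Prune at the start of each cycle, then average with the pruned weights."""
    W = metropolis_weights(adj)
    x = x0.astype(float).copy()
    for _ in range(cycles):
        Wp = prune(W, adj, x, k)
        for _ in range(steps_per_cycle):
            x = Wp @ x
    return x

# Example graph: 8 nodes on a ring, each also linked to its second neighbors.
n = 8
adj = np.zeros((n, n), dtype=int)
for i in range(n):
    for d in (1, 2):
        adj[i, (i + d) % n] = adj[(i + d) % n, i] = 1

x0 = np.arange(n, dtype=float)
x_final = adaptive_consensus(adj, x0, k=2)
print("initial disagreement:", np.linalg.norm(x0 - x0.mean()))
print("final disagreement:  ", np.linalg.norm(x_final - x_final.mean()))
```

Moving the weight of each pruned edge onto the two self-loops keeps the pruned matrix symmetric and doubly stochastic, so the network average is preserved in every cycle while fewer edges are used.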
### Mathematical representation

- **Objective function**: Minimize the sum of all local functions
  \[ \min_{x_i \in \mathbb{R}^d} \frac{1}{n} \sum_{i = 1}^n f_i(x_i) \quad \text{s.t.} \quad x_i = x_j, \ \forall i, j \in [n] \]
- **Consensus problem**: Make the estimates of all nodes consistent
  \[ x_i = x_j, \ \forall i, j \in [n] \]
- **Mixing matrix**: Defined as \(Q = [q_{ij}]_{i, j \in [n]}\), where \(q_{ij} > 0\) indicates that there is a connection between node \(i\) and node \(j\).
- **Spectral gap**