Joint Model Pruning and Topology Construction for Accelerating Decentralized Machine Learning

Zhida Jiang, Yang Xu, Hongli Xu, Lun Wang, Chunming Qiao, Liusheng Huang
DOI: https://doi.org/10.1109/tpds.2023.3303967
IF: 5.3
2023-01-01
IEEE Transactions on Parallel and Distributed Systems
Abstract: Mobile and embedded devices worldwide now generate a massive amount of data at the network edge. To efficiently exploit the data from distributed devices, we concentrate on decentralized machine learning (DML), where the workers collaboratively train models under the peer-to-peer (P2P) setting. DML avoids the bottleneck of the parameter server (PS) by enabling the workers to exchange local models with their neighbors rather than with the PS. However, DML still faces several key challenges, namely resource limitation, system heterogeneity, network dynamics, and non-IID data. In this article, we design and implement MOTOR, an efficient DML mechanism that addresses these challenges simultaneously by applying model pruning and topology construction, thus accelerating DML. Specifically, MOTOR assigns different pruning ratios to heterogeneous workers. After model pruning, each worker trains and transmits a sub-model that fits its capabilities, reducing both computation and communication overhead. In addition, MOTOR dynamically constructs the network topology, taking into account time-varying network conditions and non-IID data distributions. We theoretically analyze the impact of the pruning ratio and the network topology on model training performance. Guided by this analysis, we develop a joint optimization algorithm for pruning ratio decision and topology construction that trades off resource overhead against training performance. We implement MOTOR on commercial devices and evaluate its performance on different DML tasks. Extensive experiments show that MOTOR achieves up to 4.2× speedup compared to existing DML approaches.
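To make the setting in the abstract concrete, the following is a minimal sketch of one P2P round in which each worker prunes its model by its own ratio and then averages with its neighbors. It is not MOTOR itself: the abstract does not state the pruning criterion, the aggregation rule, or how ratios and topology are chosen, so magnitude-based pruning, uniform neighbor averaging, and the names `magnitude_prune` and `gossip_round` are all assumptions made for illustration.

```python
def magnitude_prune(weights, ratio):
    """Zero out the fraction `ratio` of weights with the smallest magnitude.

    Assumption: magnitude pruning stands in for whatever criterion MOTOR uses.
    """
    k = int(len(weights) * ratio)  # number of weights to drop
    if k == 0:
        return list(weights)
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(by_magnitude[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def gossip_round(models, neighbors, ratios):
    """One decentralized round: prune locally, then average with neighbors.

    models:    worker id -> flat list of weights (all the same length)
    neighbors: worker id -> list of neighbor ids (the topology, assumed given)
    ratios:    worker id -> pruning ratio in [0, 1] (assumed given)
    """
    pruned = {w: magnitude_prune(m, ratios[w]) for w, m in models.items()}
    updated = {}
    for w in models:
        group = [w] + list(neighbors[w])  # the worker itself plus its neighbors
        dim = len(pruned[w])
        # Assumption: plain uniform averaging over the pruned models.
        updated[w] = [sum(pruned[g][j] for g in group) / len(group)
                      for j in range(dim)]
    return updated


if __name__ == "__main__":
    # Hypothetical 3-worker line topology; weaker workers get higher ratios.
    models = {"a": [1.0, -0.1, 0.5, 2.0],
              "b": [0.2, 0.4, -3.0, 0.1],
              "c": [1.0, 1.0, 1.0, 1.0]}
    neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    ratios = {"a": 0.5, "b": 0.25, "c": 0.0}
    print(gossip_round(models, neighbors, ratios))
```

In this sketch the topology and ratios are fixed inputs; in the mechanism the abstract describes, both are decided jointly and updated as network conditions change.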