Distributed Learning for Large-Scale Models at Edge With Privacy Protection
Yuan Yuan, Shuzhen Chen, Dongxiao Yu, Zengrui Zhao, Yifei Zou, Lizhen Cui, Xiuzhen Cheng
DOI: https://doi.org/10.1109/tc.2024.3352814
IF: 3.183
2024-03-15
IEEE Transactions on Computers
Abstract: Big data and strong computing power have brought artificial intelligence into the era of large models. In particular, ChatGPT's debut heralded the rapid development of large models. Training large models with trillions of parameters efficiently is an urgent problem. Traditional single-machine training stores all data and model parameters in memory. However, because memory and communication resources are limited, memory shortages and communication blocking often occur as the amount of data or the number of model parameters grows. Distributed training is therefore the most effective way to solve these problems and improve training efficiency. In this paper, we propose DL-DP, an algorithm that achieves an asymptotically optimal convergence rate of O(1/TKΓ∗) while satisfying ε-differential privacy, where T is the number of local epochs, K is the maximum number of global iterations, and Γ∗ is the minimum covering index. In particular, when Γ∗=N, DL-DP achieves a convergence rate of O(1/TKN), which matches the best-known FedAvg approach, in which each client trains the full model. When Γ∗=1, DL-DP achieves a convergence rate of O(1/TK), which is comparable to OAP, which assumes that all parameters are trained at least once in each iteration. Finally, extensive experiments demonstrate that our algorithm converges.
engineering, electrical & electronic; computer science, hardware & architecture