Accelerating Federated Learning with Adaptive Extra Local Updates Upon Edge Networks

Yitao Fan, Mingtao Ji, Zhuzhong Qian
DOI: https://doi.org/10.1109/icpads60453.2023.00344
2023-01-01
Abstract: Delayed Gradient Averaging (DGA) has gained considerable attention for improving the training efficiency of Federated Learning (FL) at edge networks by allowing local computation to run in parallel with communication. However, it faces multiple challenges due to data distribution across heterogeneous edge devices and dynamic network environments. To address these challenges, we present A-DGA, a novel federated learning algorithm that runs communication and learning in parallel and is designed for fluctuating, heterogeneous, and high-latency network conditions. A-DGA dynamically sets the number of extra local updates based on network status, avoiding inefficient training on outdated gradients. Theoretical analysis demonstrates that A-DGA outperforms DGA in terms of convergence rate for a fixed time-slot length. We further conduct extensive evaluations under various conditions, including different datasets, models, and data distributions. The experimental results show that A-DGA outperforms state-of-the-art methods, achieving an acceleration factor of approximately 2× to 4× compared to FedAvg and 1.2× to 2× compared to DGA. In addition, A-DGA reduces energy consumption by about 40% to 50% compared to DGA.
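The idea of tying the number of extra local updates to network status can be illustrated with a minimal sketch. The abstract does not give A-DGA's actual rule, so everything below is an assumption made for illustration: the function name `extra_local_steps`, its parameters, and the policy itself (run as many local steps as fit inside one measured communication delay, capped to limit training on stale gradients) are hypothetical.

```python
import math


def extra_local_steps(comm_latency_s: float, step_time_s: float, max_steps: int = 32) -> int:
    """Hypothetical policy, not the paper's rule.

    Returns how many extra local updates fit inside one communication
    delay, capped at max_steps so the device does not keep training on
    gradients that have become too stale.
    """
    if step_time_s <= 0:
        raise ValueError("step_time_s must be positive")
    if comm_latency_s < 0:
        raise ValueError("comm_latency_s must be non-negative")
    steps = math.floor(comm_latency_s / step_time_s)
    return max(0, min(steps, max_steps))


if __name__ == "__main__":
    # Assumed per-round latency measurements (seconds) on a fluctuating link.
    for latency in (0.125, 0.5, 4.0):
        k = extra_local_steps(latency, step_time_s=0.25, max_steps=8)
        print(f"latency={latency}s -> extra local updates={k}")
```

Under this assumed policy, a device on a fast link performs few or no extra updates, while a device on a slow link fills the waiting time with local computation up to the cap. A fixed setting, by contrast, would use the same count regardless of the measured delay.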