GGS: General Gradient Sparsification for Federated Learning in Edge Computing

Shiqi Li, Qi, Jingyu Wang, Haifeng Sun, Yujian Li, F. Richard Yu
DOI: https://doi.org/10.1109/icc40277.2020.9148987
2020-01-01
Abstract: Federated learning is an emerging concept that trains machine learning models on distributed datasets without sending the raw data to a data center. In an edge computing environment where wireless network resources are constrained, however, the key problem of federated learning is the communication overhead of parameter synchronization, which wastes bandwidth, increases training time, and can even reduce model accuracy. Gradient sparsification, which transmits only significant gradients and accumulates insignificant gradients locally, has received increasing attention. However, how to preserve accuracy after high-ratio sparsification has so far been ignored. In this paper, a General Gradient Sparsification (GGS) framework is proposed for adaptive optimizers to correct the sparse gradient update process. It consists of two mechanisms: gradient correction and batch normalization update with local gradients (BN-LG). With gradient correction, the optimizer can properly treat the accumulated insignificant gradients, which makes the model converge better. Furthermore, updating the batch normalization layers with local gradients can relieve the impact of delayed gradients without increasing the communication overhead. We have conducted experiments on LeNet-5, CifarNet, DenseNet-121, and AlexNet with adaptive optimizers. Results show that when 99.9% of gradients are sparsified, top-1 accuracy on the validation datasets is maintained.
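As a rough illustration of the baseline idea the abstract builds on (send only significant gradients, accumulate the rest locally), the sketch below applies top-k selection by magnitude with a local residual buffer. It does not implement the paper's gradient correction or BN-LG mechanisms. The function name `sparsify_with_accumulation`, the `keep_ratio` parameter, and the choice of magnitude-based top-k as the significance criterion are all assumptions made for this example, not details taken from the paper.

```python
def sparsify_with_accumulation(gradient, residual, keep_ratio):
    """Top-k sparsification with local accumulation (illustrative only).

    gradient, residual: lists of floats of equal length.
    keep_ratio: fraction of entries to transmit (hypothetical parameter).
    Returns (sparse_update, new_residual); sparse_update maps index -> value.
    """
    # Add the new gradient to whatever was left unsent in earlier rounds.
    accumulated = [g + r for g, r in zip(gradient, residual)]
    # Keep at least one entry so that every round transmits something.
    k = max(1, int(len(accumulated) * keep_ratio))
    # Indices of the k entries with the largest magnitude.
    top = sorted(range(len(accumulated)),
                 key=lambda i: abs(accumulated[i]),
                 reverse=True)[:k]
    sparse_update = {i: accumulated[i] for i in top}
    # Transmitted entries are cleared; the rest stay in the local buffer.
    new_residual = [0.0 if i in sparse_update else v
                    for i, v in enumerate(accumulated)]
    return sparse_update, new_residual


# Two rounds on a toy 4-element gradient, transmitting 25% of entries.
residual = [0.0, 0.0, 0.0, 0.0]
update1, residual = sparsify_with_accumulation(
    [0.5, -0.1, 0.2, 0.05], residual, 0.25)
update2, residual = sparsify_with_accumulation(
    [0.1, -0.1, 0.25, 0.0], residual, 0.25)
```

In the second round, index 2 is selected because its locally accumulated value exceeds every other entry, even though it was withheld in the first round. This delayed transmission is the source of the stale ("delayed") gradients that the abstract says GGS aims to handle.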