Magnitude Matters: Fixing SIGNSGD Through Magnitude-Aware Sparsification and Error Feedback in the Presence of Data Heterogeneity

Richeng Jin, Xiaofan He, Caijun Zhong, Zhaoyang Zhang, Tony Q.S. Quek, Huaiyu Dai
DOI: https://doi.org/10.1109/tsp.2024.3454986
IF: 4.875
2024-01-01
IEEE Transactions on Signal Processing
Abstract: Communication overhead has become one of the major bottlenecks in the distributed training of deep neural networks. To alleviate the concern, various gradient compression methods have been proposed, and sign-based algorithms are of surging interest. However, SIGNSGD fails to converge in the presence of data heterogeneity, which is commonly observed in the emerging federated learning (FL) paradigm. Error feedback has been proposed to address the non-convergence issue. Nonetheless, it requires the workers to locally keep track of the compression errors, which renders it not suitable for FL since the workers may not participate in the training throughout the learning process. In this paper, we propose a magnitude-driven sparsification scheme, which addresses the non-convergence issue of SIGNSGD while further improving communication efficiency. Moreover, the local update and the error feedback schemes are further incorporated to improve the learning performance (i.e., test accuracy and communication efficiency), and the convergence of the proposed method is established. The effectiveness of the proposed scheme is validated through extensive experiments on Fashion-MNIST, CIFAR-10, CIFAR-100, Tiny-ImageNet, and Mini-ImageNet datasets.
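The abstract does not spell out the sparsification rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a ternary compressor in which each coordinate is transmitted as its sign with probability proportional to its magnitude and as zero otherwise. The function names, the `bound` parameter, and the three-worker toy gradients are all hypothetical and chosen for illustration. The toy case shows why plain sign aggregation can point the wrong way under heterogeneous data, while a magnitude-aware scheme can preserve the direction of the average gradient in expectation.

```python
import numpy as np

def sign_compress(grad):
    """Plain sign compressor: keep only the sign of each coordinate."""
    return np.sign(grad)

def magnitude_aware_sparsify(grad, bound, rng):
    """Hypothetical ternary compressor (an assumption, not the paper's exact rule).

    Coordinate i is sent as sign(grad[i]) with probability |grad[i]| / bound
    and as 0 otherwise, so the expected output equals grad / bound.
    """
    prob = np.clip(np.abs(grad) / bound, 0.0, 1.0)
    keep = rng.random(grad.shape) < prob
    return np.sign(grad) * keep

# Toy heterogeneous setting: three workers, one coordinate.
# The average gradient is (-1 - 1 + 4) / 3 = 2/3, which is positive.
worker_grads = np.array([[-1.0], [-1.0], [4.0]])
bound = 4.0  # assumed upper bound on the gradient magnitudes

# Sign aggregation follows the majority of workers and ignores magnitudes.
sign_vote = np.sign(sign_compress(worker_grads).mean(axis=0))

# Magnitude-aware aggregation, averaged over many independent rounds.
rng = np.random.default_rng(0)
trials = 5000
rounds = np.stack([
    magnitude_aware_sparsify(worker_grads, bound, rng).mean(axis=0)
    for _ in range(trials)
])
sparsified_estimate = rounds.mean(axis=0)

print("true average gradient:", worker_grads.mean(axis=0))
print("sign-vote direction:", sign_vote)
print("magnitude-aware estimate (scaled by 1/bound):", sparsified_estimate)
```

In this toy case the sign vote is negative although the true average gradient is positive, whereas the magnitude-aware estimate concentrates around the average gradient divided by `bound`. Many coordinates are also sent as zero, which is where the extra communication saving would come from under this assumed rule.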