FedSC: Compatible Gradient Compression for Communication-Efficient Federated Learning

Xinlei Yu, Zhipeng Gao, Chen Zhao, Zijia Mo
DOI: https://doi.org/10.1007/978-981-97-0834-5_21
2024-01-01
Abstract: Federated Learning (FL) communication costs hinge on communication frequency, device count, and per-communication-round costs. Minimizing these, within what the device cluster can tolerate, significantly curtails data traffic. Among the various methods for reducing per-communication-round costs, gradient compression stands out. Gradient compression concentrates on the changes in the model rather than on the model parameters themselves. This not only prevents synchronization issues caused by varying compression levels across devices, but also incurs only minor precision loss, which makes it especially apt for distributed scenarios like FL. Existing gradient compression methods are tailored to capitalize on elements nearing zero in high-frequency settings, aiming for sparse representation and efficient encoding. Yet in low-frequency scenarios, where elements deviate from zero, this strong emphasis on sparsity results in significant compression errors. To resolve this, we propose Federated Statistical Compression (FedSC), which shifts the focus away from sparsity. Designed specifically for low-frequency settings, it homes in on the inherent statistical characteristics of, and relationships among, the gradient elements. It extracts statistical information from each gradient and sends that information to the central server in place of the original gradient. We further introduce hierarchical estimation to improve accuracy, grouping the elements of a gradient by layer label. Experimentally, FedSC outperforms most existing gradient compression algorithms for FL, converging with fewer communication bits while maintaining high model accuracy.
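The contrast the abstract draws between sparsity-based and statistics-based compression can be illustrated with a small sketch. This is not the FedSC algorithm: the abstract does not specify which statistics are extracted or how the server reconstructs gradients, so the per-layer mean used here is a hypothetical stand-in, and the layer names (`conv1`, `fc1`), the synthetic gradients, and all helper functions are assumptions made purely for illustration. The sketch compares the relative reconstruction error of top-k sparsification on a gradient whose elements are mostly near zero against one whose elements sit away from zero, and then compares a single global summary with a summary grouped by layer label.

```python
import random
import statistics


def top_k_sparsify(grad, k):
    """Keep the k largest-magnitude elements and zero out the rest."""
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


def relative_error(original, approx):
    """L2 norm of the reconstruction error divided by the L2 norm of the original."""
    num = sum((a - b) ** 2 for a, b in zip(original, approx))
    den = sum(a * a for a in original)
    return (num / den) ** 0.5


def flatten(layers):
    """Concatenate per-layer gradients in dictionary (insertion) order."""
    return [x for g in layers.values() for x in g]


def mean_reconstruct(layers, per_layer):
    """Hypothetical stand-in for a statistical summary: replace every element
    by a mean, computed either per layer label or over the whole gradient."""
    if per_layer:
        return {name: [statistics.fmean(g)] * len(g) for name, g in layers.items()}
    global_mean = statistics.fmean(flatten(layers))
    return {name: [global_mean] * len(g) for name, g in layers.items()}


rng = random.Random(0)

# Assumed "sparse-like" gradient: most elements near zero, a few large ones.
sparse_like = [rng.gauss(0, 1) if i % 20 == 0 else rng.gauss(0, 0.001)
               for i in range(1000)]

# Assumed "dense" gradient: elements deviate from zero, differently per layer.
dense_layers = {
    "conv1": [rng.gauss(0.5, 0.05) for _ in range(500)],
    "fc1": [rng.gauss(-0.3, 0.05) for _ in range(500)],
}
dense = flatten(dense_layers)

err_topk_sparse = relative_error(sparse_like, top_k_sparsify(sparse_like, 100))
err_topk_dense = relative_error(dense, top_k_sparsify(dense, 100))
err_global_mean = relative_error(
    dense, flatten(mean_reconstruct(dense_layers, per_layer=False)))
err_layer_mean = relative_error(
    dense, flatten(mean_reconstruct(dense_layers, per_layer=True)))

print("top-k, sparse-like gradient:", err_topk_sparse)
print("top-k, dense gradient:      ", err_topk_dense)
print("global mean, dense gradient:", err_global_mean)
print("per-layer mean, dense:      ", err_layer_mean)
```

In this toy setting, keeping 10% of the elements loses little on the sparse-like gradient but most of the signal on the dense one, while two numbers per layer recover the dense gradient far more closely than a single global number. That is only meant to make the abstract's motivation concrete; it says nothing about the accuracy or bit cost of the actual method.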