FedBAT: Communication-Efficient Federated Learning via Learnable Binarization

Shiwei Li, Wenchao Xu, Haozhao Wang, Xing Tang, Yining Qi, Shijie Xu, Weihong Luo, Yuhua Li, Xiuqiang He, Ruixuan Li
2024-08-06
Abstract: Federated learning is a promising distributed machine learning paradigm that can effectively exploit large-scale data without exposing users' privacy. However, it may incur significant communication overhead, thereby potentially impairing the training efficiency. To address this challenge, numerous studies suggest binarizing the model updates. Nonetheless, traditional methods usually binarize model updates in a post-training manner, resulting in significant approximation errors and consequent degradation in model accuracy. To this end, we propose Federated Binarization-Aware Training (FedBAT), a novel framework that directly learns binary model updates during the local training process, thus inherently reducing the approximation errors. FedBAT incorporates an innovative binarization operator, along with meticulously designed derivatives to facilitate efficient learning. In addition, we establish theoretical guarantees regarding the convergence of FedBAT. Extensive experiments are conducted on four popular datasets. The results show that FedBAT significantly accelerates the convergence and exceeds the accuracy of baselines by up to 9%, even surpassing that of FedAvg in some cases.
Machine Learning; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
The paper primarily addresses the issue of communication efficiency in Federated Learning (FL) by proposing a new solution. While protecting user data privacy, Federated Learning faces the challenge of high communication overhead due to the frequent transmission of model parameters, which significantly affects training efficiency. To tackle this challenge, the paper introduces "Federated Binarization-Aware Training" (FedBAT).

### Research Background and Problem

- **Basic Framework of Federated Learning**: Federated Learning allows multiple clients to collaboratively train a global model without sharing local data.
- **Communication Overhead Issue**: Although Federated Learning can protect data privacy, the iterative transmission of model parameters brings a large amount of communication overhead, which affects training efficiency.
- **Existing Solutions**: A popular method to reduce communication volume is to use binarization techniques (such as SignSGD), which convert model updates into binary form for transmission. However, this post-processing approach leads to significant approximation errors, thereby affecting model accuracy and convergence speed.

### Main Contributions

- **FedBAT Framework**: FedBAT is a novel approach that directly learns binarized model updates during local training, thereby reducing approximation errors. It achieves efficient learning by introducing innovative binarization operators and carefully designed gradient computation methods.
- **Theoretical Guarantee**: The paper provides theoretical guarantees on the convergence of FedBAT, proving that it has a comparable convergence rate to the uncompressed version of the Federated Averaging algorithm (FedAvg).
- **Experimental Validation**: Extensive experiments on four popular datasets (FMNIST, SVHN, CIFAR-10, and CIFAR-100) show that FedBAT can significantly improve convergence speed and, in some cases, even surpass the accuracy of FedAvg.
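
To make the approximation error of post-training binarization concrete, here is a minimal sketch of one common scheme: keep only the sign of each entry of a model update plus a single shared step size. The function names, the toy update vector, and the choice of the mean absolute value as the step size are assumptions made for illustration; they are not taken from the paper or from any specific SignSGD implementation.

```python
def binarize_update(update):
    """Post-training binarization: one shared step size plus one sign per entry."""
    # Assumption of this sketch: alpha is the mean absolute value, which
    # minimizes the squared error once the sign pattern is fixed.
    alpha = sum(abs(u) for u in update) / len(update)
    signs = [1 if u >= 0 else -1 for u in update]
    return alpha, signs


def reconstruct(alpha, signs):
    """Rebuild the approximate update that the receiver would see."""
    return [alpha * s for s in signs]


def squared_error(update, approx):
    """Squared approximation error introduced by binarization."""
    return sum((u - a) ** 2 for u, a in zip(update, approx))


# Hypothetical toy update; real updates have millions of entries.
update = [0.5, -0.25, 0.125, -1.0]
alpha, signs = binarize_update(update)
approx = reconstruct(alpha, signs)
error = squared_error(update, approx)
print(alpha, signs, error)
```

Because the binarization happens only after training has finished, nothing in the local optimization can compensate for this error, which is the gap that binarization-aware training is meant to close.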

### Technical Details

- **Binarization Operator**: The paper defines a differentiable binarization operator, allowing optimization of model updates and step sizes.
- **Learning Step Sizes**: The paper proposes a mechanism to learn the step sizes for each layer to further enhance model performance.
- **Local Training Process**: FedBAT includes two stages of local training: full-precision training and binarization-aware training. The latter directly accounts for binarization errors and corrects them during the local optimization process.

### Conclusion

In summary, the paper proposes a new Federated Learning framework, FedBAT, aimed at improving communication efficiency by directly learning binarized model updates during local training. Experimental results demonstrate the effectiveness of FedBAT, particularly in improving convergence speed and model accuracy.
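
As a rough illustration of how a binarization operator with a step size can be made amenable to gradient-based learning, the sketch below uses a generic stochastic binarizer whose expected output equals the clipped input, together with the gradients of that expectation with respect to the input and the step size. This is a hypothetical stand-in: the summary does not specify FedBAT's operator or its derivatives, and all names here (`stochastic_binarize`, `expected_output`, `surrogate_grads`) are assumptions.

```python
import random


def stochastic_binarize(x, alpha, rng):
    """Map x to +alpha or -alpha so the expected output is clip(x, -alpha, alpha).

    Assumes alpha > 0.
    """
    clipped = max(-alpha, min(alpha, x))
    p_plus = (clipped + alpha) / (2 * alpha)  # probability of emitting +alpha
    return alpha if rng.random() < p_plus else -alpha


def expected_output(x, alpha):
    """Expectation of stochastic_binarize over its randomness."""
    return max(-alpha, min(alpha, x))


def surrogate_grads(x, alpha):
    """Gradients of the expected output with respect to (x, alpha).

    Assumption of this sketch: train through the expectation instead of
    the non-differentiable sample.
    """
    if abs(x) <= alpha:
        return 1.0, 0.0  # inside the range: identity in x, no alpha effect
    return 0.0, (1.0 if x > 0 else -1.0)  # clipped: only alpha moves the output


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [stochastic_binarize(0.3, 1.0, rng) for _ in range(20000)]
empirical_mean = sum(samples) / len(samples)
print(empirical_mean, expected_output(0.3, 1.0), surrogate_grads(2.0, 1.0))
```

In this sketch each sample still costs one bit to transmit, while the surrogate gradients let both the update value and the step size be adjusted during local training.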