Abstract: In federated learning, particularly in cross-device scenarios, secure aggregation has recently gained popularity as it effectively defends against inference attacks by malicious aggregators. However, secure aggregation often requires additional communication overhead and can impede the convergence rate of the global model, which is particularly challenging in wireless network environments with extremely limited bandwidth. Therefore, achieving efficient communication compression under the premise of secure aggregation presents a highly challenging and valuable problem. In this work, we propose a novel uplink communication compression method for federated learning, named FedMPQ, which is based on multi shared codebook product quantization. Specifically, we utilize updates from the previous round to generate sufficiently robust codebooks. Secure aggregation is then achieved through trusted execution environments (TEE) or a trusted third party (TTP). In contrast to previous works, our approach exhibits greater robustness in scenarios where data is not independently and identically distributed (non-IID) and there is a lack of sufficient public data. The experiments conducted on the LEAF dataset demonstrate that our proposed method achieves 99% of the baseline's final accuracy, while reducing uplink communications by 90-95%.
What problem does this paper attempt to address?
This paper aims to achieve efficient and secure communication compression in federated learning, especially in cross-device scenarios. Specifically, it focuses on reducing uplink traffic while preserving model accuracy and protecting the privacy of user data. The problem is most acute in wireless networks, where extremely limited bandwidth makes both communication overhead and slow model convergence more costly.
### Background and Challenges
1. **Communication Efficiency**: In cross-device federated learning, uplink efficiency is particularly important. Since the uplink bandwidth of mobile networks is usually 8-20 times lower than the downlink bandwidth, the parameter upload process is prone to heavy congestion.
2. **Client State**: Each client is considered stateless, that is, it may participate in only one or a few non-consecutive training rounds.
3. **Limited Computing Power**: The computing power of edge devices is limited and it is difficult to bear excessive computing overhead.
4. **Non-IID (Non-Independent and Identically Distributed) Data**: Client data is usually non-IID and highly sensitive, so privacy needs to be protected.
### Limitations of Existing Methods
Although existing gradient compression algorithms perform well in data-center training, they are often not adapted to federated learning. This not only affects the convergence speed and accuracy of the model, but may also introduce privacy vulnerabilities. For example, the traditional Product Quantization (PQ) method is not flexible enough in controlling the compression rate and cannot meet the requirements of different network conditions.
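To illustrate why plain PQ offers only coarse control over the compression rate, the following sketch computes the ratio from the two knobs PQ exposes. It is a back-of-the-envelope accounting, not taken from the paper: the function name `pq_compression_ratio` is hypothetical, values are assumed to be 32-bit floats, and the cost of sending the codebook itself is ignored (assumed shared).

```python
import math


def pq_compression_ratio(sub_dim: int, num_codewords: int, bits_per_value: int = 32) -> float:
    """Compression ratio of plain product quantization (illustrative accounting).

    Each sub-vector of `sub_dim` values is replaced by a single codeword index
    that needs ceil(log2(num_codewords)) bits. Codebook transmission cost is
    ignored here (assumption: the codebook is shared).
    """
    if num_codewords < 2:
        raise ValueError("num_codewords must be at least 2")
    index_bits = math.ceil(math.log2(num_codewords))
    return (sub_dim * bits_per_value) / index_bits


# With 256 codewords (8-bit indices), the ratio moves in large steps
# as the sub-vector dimension changes.
for sub_dim in (4, 8, 16):
    print(sub_dim, pq_compression_ratio(sub_dim, 256))
```

Under these assumptions the ratio can only be changed by picking a different sub-vector dimension or codebook size, which is one way to read the inflexibility described above.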
### Solutions Proposed in the Paper
The paper proposes a new uplink communication compression method, FedMPQ (Federated Multi-codebook Product Quantization), which achieves efficient and secure communication compression through multi-codebook product quantization. The specific contributions are as follows:
1. **Multi-codebook Generation**: Local public data and client updates are combined to generate multiple shared codebooks. In each training round, each client uploads a pseudo-codebook along with its compressed update, and the server generates the shared codebooks from these pseudo-codebooks.
2. **Residual Error Pruning**: A pruning-based residual error compression strategy lets the client flexibly control the compression rate, reducing uplink traffic while ensuring data security.
3. **Secure Aggregation**: Perform secure aggregation in the compression domain through a Trusted Execution Environment (TEE) or a Trusted Third Party (TTP) to ensure that no party can reconstruct the original update and prevent gradient leakage.
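The first two contributions above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the paper's implementation: it uses a single codebook rather than multiple shared ones, builds it with plain k-means on a previous-round update instead of from client pseudo-codebooks, and uses top-k magnitude pruning as a stand-in for the paper's residual pruning rule. All function names are hypothetical, and secure aggregation is not modeled.

```python
import numpy as np


def nearest(subs, codebook):
    """Index of the closest codeword (squared Euclidean) for each sub-vector."""
    dists = ((subs[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def build_codebook(reference, sub_dim, num_codewords, iters=10, seed=0):
    """Plain k-means over the sub-vectors of a reference vector.

    Assumption: `reference` stands in for a previous-round update, and its
    length is divisible by `sub_dim`.
    """
    rng = np.random.default_rng(seed)
    subs = reference.reshape(-1, sub_dim)
    start = rng.choice(len(subs), num_codewords, replace=False)
    codebook = subs[start].copy()
    for _ in range(iters):
        codes = nearest(subs, codebook)
        for k in range(num_codewords):
            members = subs[codes == k]
            if len(members):  # leave empty clusters unchanged
                codebook[k] = members.mean(axis=0)
    return codebook


def encode(update, codebook, keep):
    """Quantize `update`, then keep the `keep` largest-magnitude residuals."""
    sub_dim = codebook.shape[1]
    codes = nearest(update.reshape(-1, sub_dim), codebook)
    residual = update - codebook[codes].reshape(-1)
    if keep > 0:
        top = np.argsort(np.abs(residual))[-keep:]
    else:
        top = np.array([], dtype=int)
    return codes, top, residual[top]


def decode(codes, top, values, codebook):
    """Rebuild the approximate update from codeword indices and kept residuals."""
    approx = codebook[codes].reshape(-1).copy()
    approx[top] += values
    return approx


# Usage: a larger `keep` costs more uplink bits but lowers reconstruction error.
rng = np.random.default_rng(0)
prev_update = rng.normal(size=1024).astype(np.float32)
update = prev_update + 0.1 * rng.normal(size=1024).astype(np.float32)
codebook = build_codebook(prev_update, sub_dim=4, num_codewords=16)
for keep in (0, 64, 256):
    codes, top, vals = encode(update, codebook, keep)
    err = np.linalg.norm(update - decode(codes, top, vals, codebook))
    print(keep, float(err))
```

In this sketch the `keep` parameter plays the role of the client-side knob: it trades extra uplink traffic (indices and values of the kept residuals) for lower reconstruction error, which is the kind of fine-grained rate control plain PQ lacks.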
### Experimental Results
The experimental results show that, at a similar compression rate, FedMPQ converges faster than other methods without significantly reducing model accuracy. Its advantage is most pronounced on non-IID data. Specifically:
- On the CelebA and FEMNIST datasets, FedMPQ achieves approximately 25-fold uplink communication compression while maintaining high model accuracy.
- Even at extreme compression ratios, FedMPQ is still significantly better than other methods.
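The savings above are reported both as a fold ratio and, in the abstract, as a percentage reduction. The small helper below converts between the two; it is plain arithmetic with hypothetical function names, not something defined in the paper.

```python
def reduction_from_fold(fold: float) -> float:
    """Fraction of traffic removed by an n-fold compression (1 - 1/n)."""
    return 1.0 - 1.0 / fold


def fold_from_reduction(reduction: float) -> float:
    """Compression ratio implied by removing `reduction` of the traffic."""
    return 1.0 / (1.0 - reduction)


print(f"{reduction_from_fold(25) * 100:.1f}%")  # 96.0%
print(f"{fold_from_reduction(0.90):.1f}x")      # 10.0x
print(f"{fold_from_reduction(0.95):.1f}x")      # 20.0x
```

So a 25-fold compression corresponds to a 96% reduction in uplink traffic, while a 90-95% reduction corresponds to ratios of 10x to 20x.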
### Summary
With FedMPQ, the paper addresses the problem of efficient and secure communication compression in federated learning, especially in cross-device scenarios. The method improves communication efficiency while protecting data privacy, which supports large-scale federated learning in practical applications.