Secure and Efficient Federated Learning with Provable Performance Guarantees Via Stochastic Quantization
Xinchen Lyu, Xinyun Hou, Chenshan Ren, Xin Ge, Penglin Yang, Qimei Cui, Xiaofeng Tao
DOI: https://doi.org/10.1109/tifs.2024.3374590
IF: 7.231
2024-01-01
IEEE Transactions on Information Forensics and Security
Abstract: Federated learning is a popular distributed machine learning paradigm that enables collaborative model training across multiple entities by exchanging intermediate learning results. Security and communication efficiency are crucial for successful applications of federated learning in various privacy-sensitive services. However, existing work has addressed gradient defense and communication efficiency separately, and incurs additional computation, signaling, and accuracy overhead. A lightweight (in terms of time complexity and signaling) technique that simultaneously achieves security and communication efficiency is critical for massive resource-constrained devices (e.g., the Internet-of-Things devices that generate the data), but has yet to be established. This paper proposes a secure and efficient federated learning framework with provable communication-accuracy-security performance guarantees. A low-complexity and signaling-free stochastic quantization module is added at the client side; it quantizes the original local gradients to discrete values for communication-efficient global aggregation. The stochastic quantization module is shown to be interpretable as triangular or Gaussian-multiply-triangular noise under uniform or Gaussian distributions of the local gradients, respectively, hence protecting data privacy. We prove that the proposed framework exhibits an $\{O(\log_2(1/\delta)), O(\delta^2), O(1/\delta)\}$ tradeoff between the communication overhead, model accuracy, and data protection, where $\delta$ is an adjustable quantization interval. Experimental results validate the tradeoff and the superiority of the proposed stochastic quantization technique in terms of communication efficiency (only 14.1% of the overhead of differential privacy and 0.2% of that of homomorphic encryption) and computation complexity (similar to differential privacy and only 0.03% of that of homomorphic encryption).
Under the same data protection performance, the proposed approach also outperforms differential privacy in terms of accuracy in all 9 comparison settings on the CIFAR10 dataset.
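The abstract does not spell out the quantizer itself, so the following is only a minimal sketch of one common form of stochastic quantization (unbiased randomized rounding to multiples of the interval δ), not necessarily the authors' exact module. The function names, the clipping range `clip`, and the use of NumPy are assumptions made for illustration. It shows how each gradient entry is mapped to an integer level, and how the number of bits per entry grows like log2(1/δ) as δ shrinks.

```python
import numpy as np


def stochastic_quantize(grad, delta, rng):
    """Map each entry of grad to an integer level index (hypothetical helper).

    An entry g lying between k*delta and (k+1)*delta is rounded up with
    probability (g - k*delta) / delta and down otherwise, so the
    dequantized value equals g in expectation.
    """
    scaled = np.asarray(grad, dtype=np.float64) / delta
    lower = np.floor(scaled)
    prob_up = scaled - lower                      # in [0, 1)
    round_up = rng.random(scaled.shape) < prob_up
    return (lower + round_up).astype(np.int64)


def dequantize(levels, delta):
    """Recover the quantized gradient values from integer level indices."""
    return levels.astype(np.float64) * delta


def bits_per_entry(clip, delta):
    """Bits needed to index all multiples of delta in [-clip, clip].

    Assumes gradients are clipped to [-clip, clip] and that clip is a
    multiple of delta; both are assumptions of this sketch.
    """
    n_levels = int(round(2 * clip / delta)) + 1
    return int(np.ceil(np.log2(n_levels)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad = rng.uniform(-1.0, 1.0, size=5)
    delta = 0.25
    levels = stochastic_quantize(grad, delta, rng)
    print("original :", grad)
    print("quantized:", dequantize(levels, delta))
    print("bits per entry:", bits_per_entry(1.0, delta))
```

In this sketch the client would transmit only the integer levels; the server multiplies by δ before aggregation. The rounding error of each entry is bounded by δ and has variance at most δ²/4, which is consistent with (but does not by itself establish) the O(δ²) accuracy term stated in the abstract.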