Privacy-preserving clustering federated learning for non-IID data

Guixun Luo, Naiyue Chen, Jiahuan He, Bingwei Jin, Zhiyuan Zhang, Yidong Li
DOI: https://doi.org/10.1016/j.future.2024.01.005
IF: 7.307
2024-01-06
Future Generation Computer Systems
Abstract: With the increasing number of intelligent devices joining the Internet of Things (IoT), traditional centralized learning struggles to meet the performance requirements of time-critical terminal systems under heterogeneous data distributions. This challenge arises because data on terminal devices in real-world scenarios are not independent and identically distributed (non-IID), which slows overall model convergence and degrades terminal performance. Because federated learning provides a privacy-preserving collaborative training framework, this paper studies response time and performance under data heterogeneity. We propose a lightweight Randomized Response (RR) differential privacy method that protects the distribution characteristics of clients' data while quantifying their similarity. We also introduce a community detection algorithm with linear time complexity to divide clients into clusters, which addresses the inherent non-IID challenges in federated learning and meets the rapid response requirements of time-critical systems. We conduct experiments under several data distribution scenarios. The results show that the privacy-preserving mechanism has a negligible impact on model accuracy, and that our algorithm achieves significant improvements in personalization performance over baseline methods. Additionally, when some clients are disconnected during training, the pp-CFL algorithm improves the timeliness and accuracy of the personalized local model compared with solo training.
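To make the Randomized Response idea in the abstract concrete, here is a minimal sketch of the classic binary mechanism. It is not the paper's exact method: the abstract does not specify how client distributions are encoded, so the bit-vector input, the function names (`keep_probability`, `randomized_response`, `debias_frequency`), and the per-bit privacy parameter `epsilon` are all assumptions made for illustration. Each bit is reported truthfully with probability e^ε / (1 + e^ε) and flipped otherwise, and the known flip rate lets an aggregator recover an unbiased estimate of the true fraction of 1s.

```python
import math
import random


def keep_probability(epsilon: float) -> float:
    """Probability of reporting a bit truthfully (assumed per-bit budget epsilon)."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(bits, epsilon, rng):
    """Keep each bit with probability p, otherwise flip it, independently per bit."""
    p = keep_probability(epsilon)
    return [b if rng.random() < p else 1 - b for b in bits]


def debias_frequency(noisy_bits, epsilon):
    """Unbiased estimate of the fraction of 1s in the original bits.

    E[observed] = f * p + (1 - f) * (1 - p), solved here for f.
    """
    p = keep_probability(epsilon)
    observed = sum(noisy_bits) / len(noisy_bits)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    # Hypothetical data: 30% of the bits are 1.
    true_bits = [1] * 6000 + [0] * 14000
    noisy = randomized_response(true_bits, epsilon=1.0, rng=random.Random(1))
    print("estimated fraction of 1s:", debias_frequency(noisy, epsilon=1.0))
```

In this sketch the budget applies to each bit separately, so perturbing a whole vector spends more privacy budget in total. A smaller `epsilon` gives stronger privacy but a noisier estimate.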
computer science, theory & methods