Differentially Private Federated Learning on Non-iid Data: Convergence Analysis and Adaptive Optimization

Lin Chen, Xiaofeng Ding, Zhifeng Bao, Pan Zhou, Hai Jin
DOI: https://doi.org/10.1109/tkde.2024.3379001
IF: 9.235
2024-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Federated learning (FL) has attracted increasing attention in recent years because it preserves data privacy and applies well to large-scale user scenarios. However, when FL involves numerous clients, the data across clients are inevitably non-independent and identically distributed (non-iid), which poses an enormous challenge for model training and for performance analysis such as convergence. Moreover, because of the non-iid data, the participating clients tend to be extremely heterogeneous, so uneven sampling across clients causes a sampling variance problem, which induces large variation in convergence. More importantly, although FL improves privacy by keeping the training data local, when the local data is secret and sensitive FL needs stronger privacy protection to prevent the cloud server or a third party from inferring private information from shared models or intermediate gradients. To address the non-iid and privacy challenges jointly, we propose a differential privacy (DP) based non-iid FL algorithm called DPNFL. Specifically, motivated by DP and its variants, we are the first to adopt the truncated concentrated differential privacy technique in the FL scenario to track end-to-end privacy loss more tightly while requiring less noise injection for the same level of DP. To avoid the sampling variance problem, we have the server sample a subset of clients uniformly without replacement, which also guarantees unbiased sampling. To further improve performance, we also propose an adaptive version of DPNFL named AdDPNFL, which adopts adaptive optimization on the server side to simultaneously alleviate the impact of non-iid data and DP noise on model utility. Finally, we perform extensive experiments to validate the effectiveness and superiority of our algorithms.
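The server-side steps named in the abstract (uniform client sampling without replacement, noise injection for DP, and an adaptive server update) can be illustrated with a minimal Python sketch. This is not the paper's DPNFL or AdDPNFL algorithm: the function names, the clipping step, the `noise_std` parameter, and the Adam-style update are assumptions chosen for illustration, and the noise scale is not calibrated to any truncated concentrated DP budget.

```python
import math
import random


def clip(update, clip_norm):
    """Scale a vector so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def dp_round(client_updates, num_sampled, clip_norm, noise_std, rng):
    """One hypothetical server round: sample, clip, average, add noise."""
    # Uniform sampling without replacement: each client is equally likely
    # to be chosen, and no client is chosen twice in the same round.
    chosen = rng.sample(range(len(client_updates)), num_sampled)
    clipped = [clip(client_updates[i], clip_norm) for i in chosen]
    dim = len(clipped[0])
    avg = [sum(u[j] for u in clipped) / num_sampled for j in range(dim)]
    # Gaussian noise on the aggregate. noise_std is an assumed free
    # parameter here, not derived from a privacy accountant.
    noisy = [a + rng.gauss(0.0, noise_std) for a in avg]
    return chosen, noisy


def adaptive_server_step(model, delta, m, v, lr=0.1, b1=0.9, b2=0.99, eps=1e-3):
    """Adam-style server update on the noisy average update (assumed form)."""
    m = [b1 * mi + (1 - b1) * d for mi, d in zip(m, delta)]
    v = [b2 * vi + (1 - b2) * d * d for vi, d in zip(v, delta)]
    model = [w + lr * mi / (math.sqrt(vi) + eps)
             for w, mi, vi in zip(model, m, v)]
    return model, m, v
```

In this sketch, `client_updates` holds each client's model delta for the round. Dividing by the running second moment `v` rescales every coordinate, which is one common way an adaptive server optimizer can dampen the effect of noisy or heterogeneous updates; whether AdDPNFL uses this particular form is not stated in the abstract.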