Abstract: This paper aims to design a Privacy-aware Client Sampling framework for federated learning (FL), named FedPCS, to tackle heterogeneous client sampling issues and improve model performance. First, we obtain a pioneering upper bound for the accuracy loss of the FL model with privacy-aware client sampling probabilities. Based on this, we model the interactions between the central server and participating clients as a two-stage Stackelberg game. In Stage I, the central server designs the optimal time-dependent reward for cost minimization by considering the trade-off between the accuracy loss of the FL model and the rewards allocated. In Stage II, each client determines the correction factor that dynamically adjusts its privacy budget based on the reward allocated, so as to maximize its utility. To surmount the obstacle of approximating other clients' private information, we introduce a mean-field estimator to estimate the average privacy budget. We analytically demonstrate the existence and convergence of the fixed point for the mean-field estimator and derive the Stackelberg Nash Equilibrium to obtain the optimal strategy profile. Through rigorous theoretical convergence analysis, we guarantee the robustness of FedPCS. Moreover, considering the conventional sampling strategy in privacy-preserving FL, we prove that the Price of Anarchy (PoA) of the random sampling approach can be arbitrarily large. To remedy this efficiency loss, we show that the proposed privacy-aware client sampling strategy successfully reduces the PoA, which is upper bounded by a reachable constant. To address the challenge of varying privacy requirements across different training phases in FL, we extend our model and analysis and derive the adaptive optimal sampling ratio for the central server. Experimental results on different datasets demonstrate the superiority of FedPCS over existing state-of-the-art (SOTA) FL strategies under both IID and non-IID data.
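For readers unfamiliar with the Price of Anarchy (PoA) mentioned in the abstract, the standard definition for a cost-minimization game is recalled here. The notation is an assumption for illustration and is not taken from the paper: $C(s)$ denotes the social cost of strategy profile $s$, $\mathcal{S}$ the set of all strategy profiles, and $\mathcal{E} \subseteq \mathcal{S}$ the set of equilibrium profiles.

$$
\mathrm{PoA} \;=\; \frac{\max_{s \in \mathcal{E}} C(s)}{\min_{s \in \mathcal{S}} C(s)} \;\ge\; 1 .
$$

A PoA close to 1 means that self-interested behavior costs little relative to the socially optimal outcome, while an unbounded PoA means that the worst equilibrium can be arbitrarily worse than the optimum. The abstract does not state which social cost the paper uses, so $C$ is left abstract here.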

What problem does this paper attempt to address?

The main problem this paper attempts to solve is how to optimize the client sampling strategy to improve model performance while preserving privacy in a Federated Learning (FL) system. Specifically, the paper aims to design a privacy-aware client sampling framework (FedPCS) that addresses the heterogeneous client sampling problem and improves model performance.

### 1. **Trade-off between Privacy Protection and Model Performance**

In an FL system, clients usually add noise to their local parameters before uploading them in order to prevent privacy leakage, which degrades FL model performance. The goal of the paper is therefore to design a client sampling method that minimizes the loss of model performance while protecting privacy.

### 2. **Heterogeneous Client Sampling Problem**

Since the clients participating in training have different data distributions and computing capabilities, the traditional random sampling method may lead to model bias and slow convergence. In addition, different clients have different privacy requirements, which further complicates the sampling strategy.

### 3. **Incentive Mechanism Design**

To encourage clients to actively participate in training, the paper also designs an incentive mechanism that uses rewards to balance clients' privacy requirements against their contributions. Each client can dynamically adjust its privacy budget according to the assigned reward to maximize its own utility.

### 4. **Stackelberg Game Model**

The paper models the interaction between the central server and the clients as a two-stage Stackelberg game:

- **First stage**: The central server designs the optimal time-dependent rewards to minimize its cost, trading off the loss of model accuracy against the rewards it pays out.
- **Second stage**: Each client dynamically adjusts its privacy budget according to the assigned reward to maximize its own utility.

### 5. **Application of Mean-Field Estimator**

Because clients cannot share private information with each other, the paper introduces a mean-field estimator to estimate the average privacy budget. With this estimator, clients can derive the optimal correction factor without exchanging private information.

### 6. **Efficiency Analysis**

Through a Price of Anarchy (PoA) analysis, the paper proves that the PoA of the random sampling strategy can be arbitrarily large, and that the proposed privacy-aware sampling strategy reduces the PoA and bounds it above by a reachable constant.

### 7. **Adaptive Sampling under Dynamic Privacy Constraints**

Considering that privacy requirements may change across training stages, the paper further extends the model and derives an adaptive optimal sampling ratio for the central server under dynamic privacy constraints.

### Summary

The core problem of the paper is to design a client sampling strategy that protects privacy while improving model performance, supported by theoretical analysis based on game theory and mean-field theory and by experimental evidence.
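The mean-field idea described above can be illustrated with a small fixed-point iteration. This is a minimal sketch under stated assumptions, not the paper's algorithm: the client utility, the `best_response` formula, the per-client `costs`, and the damping scheme are all hypothetical choices made so that the example has a closed-form check. Each client responds only to a shared estimate `phi` of the average privacy budget, and the estimate is updated until it matches the average that the responses actually produce.

```python
import numpy as np


def best_response(reward, costs, mean_budget):
    """Hypothetical client best response (not the paper's formula).

    Assumed toy utility: u_i = reward * eps_i / (1 + mean_budget) - cost_i * eps_i**2.
    Setting du_i/d(eps_i) = 0 gives the closed-form maximiser returned here.
    """
    return reward / (2.0 * costs * (1.0 + mean_budget))


def mean_field_fixed_point(reward, costs, phi0=1.0, damping=0.5,
                           tol=1e-10, max_iter=1000):
    """Find phi such that the mean of the clients' best responses equals phi."""
    costs = np.asarray(costs, dtype=float)
    phi = phi0
    for _ in range(max_iter):
        # Average privacy budget realised when every client responds to phi.
        realised = float(np.mean(best_response(reward, costs, phi)))
        if abs(realised - phi) < tol:
            return realised
        # Damped update: move the estimate part of the way toward the realised mean.
        phi = (1.0 - damping) * phi + damping * realised
    raise RuntimeError("mean-field iteration did not converge")


# Three clients with different (assumed) privacy cost coefficients.
costs = np.array([0.5, 1.0, 2.0])
phi_star = mean_field_fixed_point(reward=2.0, costs=costs)
budgets = best_response(2.0, costs, phi_star)
print("estimated average privacy budget:", phi_star)
print("per-client privacy budgets:", budgets)
```

In this toy setting no client ever sees another client's budget or cost; only the scalar estimate is shared. The damping factor is a design choice for this example that keeps the iteration stable; the paper establishes existence and convergence of its own fixed point analytically.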

A Game-Theoretic Framework for Privacy-Aware Client Sampling in Federated Learning

Projected Federated Averaging with Heterogeneous Differential Privacy

Adaptive Client Sampling in Federated Learning via Online Learning with Bandit Feedback

FedSampling: A Better Sampling Strategy for Federated Learning

Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated Learning via Class-Imbalance Reduction

A Game-theoretic Framework for Privacy-preserving Federated Learning

Performance Analysis and Optimization in Privacy-Preserving Federated Learning

QI-DPFL: Quality-Aware and Incentive-Boosted Federated Learning with Differential Privacy

The Power of Bias: Optimizing Client Selection in Federated Learning with Heterogeneous Differential Privacy

Collaboration in Federated Learning With Differential Privacy: A Stackelberg Game Analysis

Fairness-Aware Client Selection for Federated Learning

Joint Client-and-Sample Selection for Federated Learning Via Bi-level Optimization

Probably Approximately Correct Federated Learning

A Robust Game-theoretical Federated Learning Framework with Joint Differential Privacy

LEFL: Low Entropy Client Sampling in Federated Learning

CRS-FL: Conditional Random Sampling for Communication-Efficient and Privacy-Preserving Federated Learning

Federated Learning with Personalized Differential Privacy Combining Client Selection

FLAS: Computation and Communication Efficient Federated Learning via Adaptive Sampling

Adaptive Heterogeneous Client Sampling for Federated Learning over Wireless Networks
