Abstract: Federated learning has emerged as a distributed learning paradigm by training at each client and aggregating at a parameter server. System heterogeneity hinders stragglers from responding to the server in time with huge communication costs. Although client grouping in federated learning can solve the straggler problem, the stochastic selection strategy in client grouping neglects the impact of data distribution within each group. Besides, current client grouping approaches make clients suffer unfair participation, leading to biased performances for different clients. In order to guarantee the fairness of client participation and mitigate biased local performances, we propose a federated dynamic client selection method based on data representativity (FedSDR). FedSDR clusters clients into groups correlated with their own local computational efficiency. To estimate the significance of client datasets, we design a novel data representativity evaluation scheme based on local data distribution. Furthermore, the two most representative clients in each group are selected to optimize the global model. Finally, the DYNAMIC-SELECT algorithm updates local computational efficiency and data representativity states to regroup clients after periodic average aggregation. Evaluations on real datasets show that FedSDR improves client participation by 27.4%, 37.9%, and 23.3% compared with FedAvg, TiFL, and FedSS, respectively, taking fairness into account in federated learning. In addition, FedSDR surpasses FedAvg, FedGS, and FedMS by 21.32%, 20.4%, and 6.90%, respectively, in local test accuracy variance, balancing the performance bias of the global model across clients.
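
The two fairness measures reported in the abstract, client participation and local test accuracy variance, can be illustrated with a short Python sketch. The abstract does not define either quantity precisely, so the definitions here are assumptions: participation is taken as the fraction of clients selected at least once, and the variance is the population variance of per-client test accuracy. The function names and the toy numbers are hypothetical.

```python
from statistics import pvariance


def participation_rate(selection_history, num_clients):
    """Fraction of clients that were selected in at least one round.

    Assumption: this is one plausible reading of "client participation";
    the paper may define it differently.
    """
    seen = set()
    for round_clients in selection_history:
        seen.update(round_clients)
    return len(seen) / num_clients


def accuracy_variance(local_accuracies):
    """Population variance of the global model's accuracy across clients.

    A lower value means the model performs more evenly on all clients.
    """
    return pvariance(local_accuracies)


# Toy data (made up): client ids selected in each of three rounds,
# and the global model's test accuracy on five clients.
history = [[0, 1], [1, 2], [0, 2]]
accuracies = [0.80, 0.82, 0.78, 0.81, 0.79]

rate = participation_rate(history, num_clients=5)
spread = accuracy_variance(accuracies)
print(f"participation rate: {rate:.2f}, accuracy variance: {spread:.6f}")
```

Under these assumed definitions, a fairer selection strategy raises the participation rate and lowers the accuracy variance.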

What problem does this paper attempt to address?

The paper aims to guarantee fairness in heterogeneous edge computing environments through a federated dynamic client selection method. It focuses on two main issues:

1. **Unfairness in client participation**: Existing federated learning methods often overlook differences in computational capability and data distribution when selecting clients for training. As a result, some clients (especially those with weaker computational capability or skewed data distributions) cannot participate fairly in model training, which can bias the performance of the global model.
2. **Bias in global model performance**: Because data distributions are inconsistent across clients, existing client grouping methods do not accurately reflect each client's data representativeness, so the global model performs very differently on different clients. This bias harms both the overall performance and the fairness of the global model.

To address these issues, the paper proposes a federated dynamic client selection method based on data representativeness (FedSDR). The method proceeds in four steps:

1. **Constructing a polynomial distribution set**: Clients are assigned to different groups based on their local computational efficiency. Clients within each group have similar computational efficiency.
2. **Evaluating data representativeness**: A new data representativeness evaluation scheme is designed to assess the importance of each client's data based on the similarity between the local data distribution and the global data distribution.
3. **Selecting representative clients**: The two most representative clients in each group are selected to participate in training, so that each client has a fair chance to participate.
4. **Dynamically adjusting client selection**: Through the DYNAMIC-SELECT algorithm, the computational efficiency and selection weight of clients are updated after each iteration, dynamically adjusting the set of clients participating in training.

Through these steps, FedSDR aims to increase client participation and balance the performance bias of the global model across different clients, thereby achieving fairer federated learning. Experimental results on real datasets show that FedSDR improves client participation by 27.4%, 37.9%, and 23.3% compared with FedAvg, TiFL, and FedSS, respectively, and surpasses FedAvg, FedGS, and FedMS by 21.32%, 20.4%, and 6.90%, respectively, in local test accuracy variance.
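
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation. Representativeness is approximated as one minus the total variation distance between a client's label histogram and the pooled global histogram, and groups are formed by sorting clients on a measured per-round compute time and cutting the sorted list into tiers. The function names (`representativeness`, `group_by_speed`, `select_clients`) and all numbers are hypothetical.

```python
def representativeness(local_counts, global_counts):
    """Score in [0, 1]; 1 means the local label mix matches the global one.

    Assumption: similarity = 1 - total variation distance between the
    normalized label histograms. The paper's actual metric may differ.
    """
    local_total = sum(local_counts)
    global_total = sum(global_counts)
    tv = 0.5 * sum(
        abs(l / local_total - g / global_total)
        for l, g in zip(local_counts, global_counts)
    )
    return 1.0 - tv


def group_by_speed(compute_times, num_groups):
    """Sort client ids by compute time and cut them into up to `num_groups` tiers."""
    order = sorted(range(len(compute_times)), key=lambda cid: compute_times[cid])
    size = -(-len(order) // num_groups)  # ceiling division: max tier size
    return [order[i:i + size] for i in range(0, len(order), size)]


def select_clients(groups, scores, per_group=2):
    """Pick the `per_group` highest-scoring clients from every group."""
    selected = []
    for group in groups:
        ranked = sorted(group, key=lambda cid: scores[cid], reverse=True)
        selected.extend(ranked[:per_group])
    return selected


# Toy setup (made up): 6 clients, 3 classes, per-class sample counts.
label_counts = [
    [30, 30, 30],
    [90, 5, 5],
    [10, 40, 40],
    [0, 0, 60],
    [20, 25, 15],
    [50, 50, 0],
]
compute_times = [1.2, 0.8, 3.5, 3.1, 1.0, 2.9]  # seconds per local round

global_counts = [sum(column) for column in zip(*label_counts)]
scores = [representativeness(counts, global_counts) for counts in label_counts]
groups = group_by_speed(compute_times, num_groups=2)
chosen = select_clients(groups, scores, per_group=2)
print("groups:", groups)
print("selected clients:", chosen)
```

Re-running `group_by_speed` and `select_clients` with refreshed compute times and scores after each aggregation would mimic the dynamic regrouping step. The actual update rule of DYNAMIC-SELECT is not described in this summary, so it is not reproduced here.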

Federated Dynamic Client Selection for Fairness Guarantee in Heterogeneous Edge Computing

An EMD-Based Adaptive Client Selection Algorithm for Federated Learning in Heterogeneous Data Scenarios

Federated learning with workload-aware client scheduling in heterogeneous systems

Fairness and Accuracy in Federated Learning

FedSC: A federated learning algorithm based on client-side clustering

Fairness-Aware Client Selection for Federated Learning

FED-SDS: Adaptive Structured Dynamic Sparsity for Federated Learning under Heterogeneous Clients

Data Quality-Aware Client Selection in Heterogeneous Federated Learning

FedUB: Federated Learning Algorithm Based on Update Bias

FedSS: Federated Learning with Smart Selection of clients

Tackling Mavericks in Federated Learning via Adaptive Client Selection Strategy

FedDCS: Federated Learning Framework Based on Dynamic Client Selection

CDFed: Contribution-based Dynamic Federated Learning for Managing System and Statistical Heterogeneity

FedFair^3: Unlocking Threefold Fairness in Federated Learning

DPP-based Client Selection for Federated Learning with Non-IID Data

An Online Learning Approach for Client Selection in Federated Edge Learning under Budget Constraint

APCSMA: Adaptive Personalized Client-Selection and Model-Aggregation Algorithm for Federated Learning in Edge Computing Scenarios

Scout: an Efficient Federated Learning Client Selection Algorithm Driven by Heterogeneous Data and Resource

Addressing Heterogeneity in Federated Learning with Client Selection via Submodular Optimization

Adaptive Clustered Federated Learning with Representation Similarity