Abstract: As a privacy-preserving paradigm for training Machine Learning (ML) models, Federated Learning (FL) has received tremendous attention from both industry and academia. In a typical FL scenario, clients exhibit significant heterogeneity in terms of data distribution and hardware configurations. Thus, randomly sampling clients in each training round may not fully exploit the local updates from heterogeneous clients, resulting in lower model accuracy, slower convergence rate, degraded fairness, etc. To tackle the FL client heterogeneity problem, various client selection algorithms have been developed, showing promising performance improvement. In this paper, we systematically present recent advances in the emerging field of FL client selection and its challenges and research opportunities. We hope to facilitate practitioners in choosing the most suitable client selection mechanisms for their applications, as well as inspire researchers and newcomers to better understand this exciting research topic.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is the client heterogeneity problem in Federated Learning (FL). Specifically:

1. **Client Heterogeneity**: In a typical Federated Learning scenario, clients exhibit significant heterogeneity in data distribution and hardware configuration. When clients are sampled at random for each training round, this heterogeneity means that local updates from different clients may not be fully exploited, resulting in reduced model accuracy, slower convergence, and compromised fairness.
2. **Limitations of Existing Methods**: Traditional random sampling does not account for the differences between clients and thus performs poorly in practical applications. To overcome this, researchers have developed various client selection algorithms to improve model performance.
3. **Research Objectives**: This paper aims to systematically summarize recent progress in Federated Learning client selection and to discuss the challenges the field faces and its future research opportunities. In this way, the authors hope to help practitioners choose the client selection mechanism most suitable for their application scenarios, and to give researchers and newcomers a guide to an in-depth understanding of this emerging research field.

### Specific Problem Description

- **System Heterogeneity**: Clients differ in hardware configuration, including computing power, communication capability, and energy consumption. For example, the computing power of mobile devices may differ by dozens of times, and network bandwidth may differ by an order of magnitude.
- **Statistical Heterogeneity**: Client data is unevenly distributed and not independent and identically distributed (Non-IID). For example, some clients may hold a large amount of data while others hold very little; in addition, the data distributions of different clients may be completely different.

### Solutions

To address the above problems, the paper discusses the following solutions:

- **Client Selection Algorithms**: Design client selection algorithms that prioritize the clients most helpful for global model updates. These algorithms can evaluate the priority of each client based on statistical utility (such as the number of data samples or loss function values) and system utility (such as computing time and communication delay).
- **Optimization Strategies**: Balance exploration (selecting more diverse clients) against exploitation (selecting high-priority clients) to avoid the performance degradation caused by neglecting certain clients over the long term.

### Paper Structure

The paper explores the Federated Learning client selection problem in detail through the following aspects:

1. **Literature Review**: Introduces the search and evaluation process for existing literature to ensure that the most representative research results are covered.
2. **Client Heterogeneity Analysis**: Discusses in detail the impact of system heterogeneity and statistical heterogeneity on Federated Learning.
3. **Priority Evaluation**: Introduces how to measure client priorities and select clients accordingly, including specific formulas for statistical utility and system utility.
4. **Implementation Practice**: Summarizes the main frameworks and tools currently used for client selection.
5. **Challenges and Opportunities**: Points out the main challenges in this field and future research directions.

Through these contents, the paper gives readers a comprehensive and in-depth understanding, helping them better select and design Federated Learning client selection algorithms in practical applications.
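The selection idea summarized under Solutions (rank clients by a priority that combines statistical and system utility, while reserving part of each round for exploration) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Client` fields, the `priority` formula (sample count times the square root of the mean loss, scaled down for clients slower than a deadline), and the `explore_fraction` parameter are hypothetical and are not taken from the paper or from any specific algorithm it reviews.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Client:
    client_id: int
    num_samples: int      # size of the local dataset
    mean_loss: float      # average training loss reported in the last round
    round_seconds: float  # estimated compute + communication time (> 0)


def priority(client, deadline_seconds):
    """Illustrative priority: statistical utility scaled by a system penalty."""
    # Assumed statistical utility: more data and higher loss rank higher.
    statistical = client.num_samples * math.sqrt(client.mean_loss)
    # Assumed system utility: no penalty within the deadline, a
    # proportional penalty for clients slower than the deadline.
    system = min(1.0, deadline_seconds / client.round_seconds)
    return statistical * system


def select_clients(clients, k, deadline_seconds, explore_fraction, rng):
    """Pick up to k clients: the top-priority ones plus a random share."""
    num_explore = int(k * explore_fraction)
    num_exploit = k - num_explore
    ranked = sorted(
        clients, key=lambda c: priority(c, deadline_seconds), reverse=True
    )
    exploited = ranked[:num_exploit]   # exploitation: highest priority
    remaining = ranked[num_exploit:]
    # Exploration: sample uniformly from clients that were not exploited.
    explored = rng.sample(remaining, min(num_explore, len(remaining)))
    return exploited + explored


if __name__ == "__main__":
    rng = random.Random(0)
    pool = [Client(i, 100 * (i + 1), 1.0, 10.0) for i in range(10)]
    chosen = select_clients(pool, k=4, deadline_seconds=20.0,
                            explore_fraction=0.25, rng=rng)
    print([c.client_id for c in chosen])
```

Setting `explore_fraction` to 0 reduces the sketch to pure top-k selection, and setting it to 1 reduces it to uniform random sampling, which makes the exploration and exploitation trade-off explicit in a single parameter.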

Client Selection in Federated Learning: Principles, Challenges, and Opportunities

Fairness-Aware Client Selection for Federated Learning

A Snapshot of the Frontiers of Client Selection in Federated Learning

A Review of Client Selection Methods in Federated Learning

Stochastic Client Selection for Federated Learning with Volatile Clients

A review on client selection models in federated learning

Welfare and Fairness Dynamics in Federated Learning: A Client Selection Perspective

Federated Select: A Primitive for Communication- and Memory-Efficient Federated Learning

Online Client Selection for Asynchronous Federated Learning With Fairness Consideration

Submodular Maximization Approaches for Equitable Client Selection in Federated Learning

Client Selection for Federated Learning with Heterogeneous Resources in Mobile Edge

Addressing Heterogeneity in Federated Learning with Client Selection via Submodular Optimization

Sample-level Data Selection for Federated Learning

DPP-based Client Selection for Federated Learning with Non-IID Data

Towards Federated Learning on Fresh Datasets

Smart client selection strategies for enhanced federated learning in digital healthcare applications

Data Quality-Aware Client Selection in Heterogeneous Federated Learning

The Power of Bias: Optimizing Client Selection in Federated Learning with Heterogeneous Differential Privacy

Reputation-Aware Federated Learning Client Selection based on Stochastic Integer Programming