Abstract: Federated Learning (FL) heavily depends on label quality for its performance. However, the label distribution among individual clients is always both noisy and heterogeneous. The high loss incurred by client-specific samples in heterogeneous label noise poses challenges for distinguishing between client-specific and noisy label samples, impacting the effectiveness of existing label noise learning approaches. To tackle this issue, we propose FedFixer, where the personalized model is introduced to cooperate with the global model to effectively select clean client-specific samples. In the dual models, updating the personalized model solely at a local level can lead to overfitting on noisy data due to limited samples, consequently affecting both the local and global models' performance. To mitigate overfitting, we address this concern from two perspectives. Firstly, we employ a confidence regularizer to alleviate the impact of unconfident predictions caused by label noise. Secondly, a distance regularizer is implemented to constrain the disparity between the personalized and global models. We validate the effectiveness of FedFixer through extensive experiments on benchmark datasets. The results demonstrate that FedFixer can perform well in filtering noisy label samples on different clients, especially in highly heterogeneous label noise scenarios.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is the degradation of model performance in Federated Learning (FL) caused by heterogeneous label noise. Specifically:

1. **Heterogeneity of label noise**: In federated learning, the data labels of each client may contain noise, and the distribution of this noise varies from client to client. This heterogeneity makes it difficult to distinguish between client-specific samples and noisy-label samples, which reduces the effectiveness of existing label-noise learning methods.
2. **Overfitting problem**: Because each client has a limited amount of data, personalized models are prone to overfitting noisy-label data, which in turn affects the performance of the global model.

To solve the above problems, the authors propose the FedFixer method. Its main goal is to effectively select clean client-specific samples by introducing a dual-model structure (combining the global model and the personalized model), and to mitigate the impact of overfitting through two regularizers. The specific contributions are as follows:

- **Dual-model structure**: A dual-model structure is designed, which can adapt to the heterogeneity of label noise among different clients.
- **Regularizers**:
  - **Confidence Regularizer**: Used to reduce uncertain predictions caused by label noise.
  - **Distance Regularizer**: Constrains the difference between the personalized model and the global model to prevent the personalized model from overfitting local noisy data.
- **Experimental verification**: Extensive experiments on multiple benchmark datasets verify the effectiveness of FedFixer in dealing with heterogeneous label noise. Especially in highly heterogeneous label-noise scenarios, it performs better than existing methods.
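The summary does not say how the two models decide which samples are clean. A common choice in noisy-label learning is the small-loss criterion, where each model keeps the lowest-loss fraction of a batch and passes that selection to its peer. The sketch below assumes that criterion purely for illustration; `select_small_loss`, `cross_select`, and `keep_ratio` are hypothetical names, not taken from the paper.

```python
def select_small_loss(losses, keep_ratio):
    """Return a 0/1 mask that keeps the lowest-loss fraction of samples.

    Assumption: small-loss selection stands in for the (unspecified)
    clean-sample criterion; keep_ratio is a hypothetical hyper-parameter.
    """
    if not losses:
        return []
    n_keep = max(1, int(keep_ratio * len(losses)))
    # Indices sorted from smallest to largest loss.
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    keep = set(order[:n_keep])
    return [1 if i in keep else 0 for i in range(len(losses))]


def cross_select(global_losses, personal_losses, keep_ratio):
    """Each model selects the samples that its peer will train on."""
    mask_for_personal = select_small_loss(global_losses, keep_ratio)
    mask_for_global = select_small_loss(personal_losses, keep_ratio)
    return mask_for_global, mask_for_personal


# Sample 1 is "client-specific": high loss under the global model,
# low loss under the personalized model.
global_losses = [0.2, 2.5, 0.4, 0.3]
personal_losses = [0.1, 0.3, 2.0, 0.9]
mask_for_global, mask_for_personal = cross_select(
    global_losses, personal_losses, keep_ratio=0.5
)
```

In this toy batch the personalized model keeps sample 1 for the global model's update even though the global model alone would have discarded it, which is the kind of client-specific sample the dual-model structure is meant to recover.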
### Formula summary

- **Loss function**:

  \[ F_k(w)=\frac{1}{\bar{n}_k}\sum_{n\in [n_k]}v_n\cdot\ell(x_n,\tilde{y}_n;\theta_k)+\frac{\lambda}{2}\|\theta_k - w\|^2 \]

  where \(v_n\in\{0, 1\}\) indicates whether sample \(n\) is a clean sample, and \(\ell(\cdot)\) is the loss function.

- **Confidence Regularizer**:

  \[ \ell_{CR}(f(x_n)) := -\beta\cdot \mathbb{E}_{\tilde{Y}|\tilde{D}}[\ell_{CE}(f(x_n),\tilde{Y})] \]

  where \(\beta\geq0\) is a hyper-parameter, and \(P(\tilde{Y}|\tilde{D})\) is the prior probability determined based on the noisy dataset.

- **Distance Regularizer**:

  \[ \frac{\lambda}{2}\|\theta_k - w\|^2 \]

  where \(\theta_k\) is the personalized model of client \(k\), and \(\lambda\in(0,+\infty)\) is a regularization parameter.

Through these methods, FedFixer can effectively deal with the heterogeneity problem of label noise in the federated learning environment and improve the generalization performance of the model.
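To make the formulas above concrete, here is a minimal pure-Python sketch of the three terms. It rests on two assumptions the summary does not confirm: that \(\bar{n}_k\) is the number of samples with \(v_n = 1\), and that the label prior is given as a list of class probabilities estimated from the noisy labels. All function names are hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy of one predicted distribution against an integer label."""
    return -math.log(max(probs[label], 1e-12))


def confidence_regularizer(probs, label_prior, beta):
    """-beta * E_{Y ~ prior}[CE(probs, Y)].

    Assumption: label_prior[y] approximates P(Y = y) on the noisy dataset.
    """
    expected_ce = sum(p * cross_entropy(probs, y) for y, p in enumerate(label_prior))
    return -beta * expected_ce


def distance_regularizer(theta_k, w, lam):
    """(lam / 2) * ||theta_k - w||^2 for flat lists of parameters."""
    return 0.5 * lam * sum((a - b) ** 2 for a, b in zip(theta_k, w))


def local_objective(per_sample_losses, clean_mask, theta_k, w, lam):
    """Mean loss over selected samples plus the distance term.

    Assumption: the normalizer is the count of samples with v_n = 1.
    """
    n_selected = sum(clean_mask)
    if n_selected == 0:
        data_term = 0.0
    else:
        data_term = sum(v * l for v, l in zip(clean_mask, per_sample_losses)) / n_selected
    return data_term + distance_regularizer(theta_k, w, lam)


# Toy usage: three samples, the middle one flagged as noisy (v_n = 0).
value = local_objective(
    per_sample_losses=[1.0, 3.0, 5.0],
    clean_mask=[1, 0, 1],
    theta_k=[1.0, 2.0],
    w=[0.0, 0.0],
    lam=2.0,
)
```

Here the data term averages only the two selected losses, and the distance term grows as the personalized parameters `theta_k` drift from the global parameters `w`, so a larger `lam` keeps the personalized model closer to the global one.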

FedFixer: Mitigating Heterogeneous Label Noise in Federated Learning

FedDGP: Disentangling Global and Personal Models for Federated Learning

Tackling Noisy Clients in Federated Learning with End-to-end Label Correction

Federated Data Quality Assessment Approach: Robust Learning With Mixed Label Noise

Federated Learning with Label Distribution Skew via Logits Calibration

FedDiv: Collaborative Noise Filtering for Federated Learning with Noisy Labels

Learning Cautiously in Federated Learning with Noisy and Heterogeneous Clients

Federated Learning with Label-Masking Distillation

FedPer++: Toward Improved Personalized Federated Learning on Heterogeneous and Imbalanced Data

MpFedcon: Model-Contrastive Personalized Federated Learning with the Class Center

Learning Locally, Revising Globally: Global Reviser for Federated Learning with Noisy Labels

Federated Learning with Extremely Noisy Clients via Negative Distillation

Personalized federated learning based on feature fusion

Federated Learning with Instance-Dependent Noisy Label

FedDistill: Global Model Distillation for Local Model De-Biasing in Non-IID Federated Learning

FedImpro: Measuring and Improving Client Update in Federated Learning

Overhead-free Noise-tolerant Federated Learning: A New Baseline

Medical federated learning with joint graph purification for noisy label learning

Rethinking Client Drift in Federated Learning: A Logit Perspective

FedLF: Adaptive Logit Adjustment and Feature Optimization in Federated Long-Tailed Learning

Data Quality-Aware Client Selection in Heterogeneous Federated Learning