Relaxed Contrastive Learning for Federated Learning

Seonguk Seo, Jinkyu Kim, Geeho Kim, Bohyung Han
DOI: https://doi.org/10.48550/arXiv.2401.04928
IF: 5.414
2024-01-10
Machine Learning
Abstract: We propose a novel contrastive learning framework to effectively address the challenges of data heterogeneity in federated learning. We first analyze the inconsistency of gradient updates across clients during local training and establish its dependence on the distribution of feature representations, leading to the derivation of the supervised contrastive learning (SCL) objective to mitigate local deviations. In addition, we show that a naïve adoption of SCL in federated learning leads to representation collapse, resulting in slow convergence and limited performance gains. To address this issue, we introduce a relaxed contrastive learning loss that imposes a divergence penalty on excessively similar sample pairs within each class. This strategy prevents collapsed representations and enhances feature transferability, facilitating collaborative training and leading to significant performance improvements. Our framework outperforms all existing federated learning approaches by huge margins on the standard benchmarks through extensive experimental results.
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper addresses data heterogeneity in Federated Learning (FL). It focuses on two main challenges:

1. **Data Heterogeneity and Class Imbalance**: In Federated Learning, data distributions can differ widely across clients, and the classes on each client may be severely imbalanced. As a result, the local optima of the client models deviate substantially from one another, which hinders the global model from reaching a good global optimum and slows convergence.
2. **Feature Collapse in Supervised Contrastive Learning (SCL)**: Although SCL helps alleviate optimization issues in Federated Learning, applying it directly leads to feature collapse, where the features of samples from the same class become overly similar. This weakens the model's generalization ability, resulting in slow convergence and limited performance improvement.

### Solution

To address these challenges, the paper proposes a new contrastive learning framework, **Relaxed Contrastive Learning for Federated Learning (FedRCL)**. The solution consists of the following parts:

1. **Theoretical Analysis**: The paper first re-derives bounds on the deviation of local gradient updates and shows that Supervised Contrastive Learning (SCL) can reduce this deviation, yielding more consistent local updates across heterogeneous clients.
2. **Identifying the Feature Collapse Problem**: Through experimental validation, the paper finds that SCL in Federated Learning leads to feature collapse, where the features of samples from the same class become overly similar, harming the model's generalization ability and cross-task transferability.
3. **Introducing Relaxed Contrastive Loss**: To solve the feature collapse problem, the paper proposes a Relaxed Supervised Contrastive Loss (RCL). This loss imposes a divergence penalty on pairs of same-class samples that are excessively similar, preventing their features from collapsing and thereby enhancing feature diversity and transferability.
4. **Multi-level Contrastive Training**: To further improve the effectiveness of model aggregation, the paper extends contrastive learning to the early layers of the model architecture. This helps achieve consistent local updates at all levels, especially in the early layers.

### Experimental Results

The paper validates FedRCL through extensive experiments. The results show that FedRCL significantly outperforms existing Federated Learning algorithms on multiple standard benchmark datasets, especially in scenarios with high data heterogeneity. The specific improvements are as follows:

- **Faster Convergence**: FedRCL converges quickly in the early stages of training.
- **Higher Performance**: FedRCL significantly improves model performance across multiple datasets and settings.
- **Stronger Robustness**: FedRCL maintains good performance even at extremely low participation rates.

In summary, the paper addresses data heterogeneity and feature collapse in Federated Learning by introducing a relaxed contrastive learning framework, significantly enhancing the model's performance and robustness.
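
To make the idea of a relaxed contrastive loss more concrete, the following is a minimal NumPy sketch, not the paper's exact formulation. It combines a standard supervised contrastive term with a penalty on same-class pairs whose cosine similarity exceeds a threshold. The function name `relaxed_supcon_loss`, the parameters `sim_threshold` and `penalty_weight`, their default values, and the hinge-style form of the penalty are all assumptions chosen for illustration.

```python
import numpy as np


def relaxed_supcon_loss(features, labels, temperature=0.1,
                        sim_threshold=0.7, penalty_weight=1.0):
    """Illustrative sketch: SupCon term plus a penalty on overly similar
    same-class pairs. sim_threshold, penalty_weight and the hinge-style
    penalty are hypothetical choices, not taken from the paper.

    Expects at least two samples with non-zero feature vectors.
    Returns (total_loss, supcon_term, penalty_term).
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    n = features.shape[0]

    # L2-normalise so that dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T

    not_self = ~np.eye(n, dtype=bool)
    same_class = (labels[:, None] == labels[None, :]) & not_self

    # Supervised contrastive term: log-softmax over all other samples,
    # averaged over the positives of each anchor that has any.
    logits = sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp_logits = np.exp(logits) * not_self
    log_prob = logits - np.log(exp_logits.sum(axis=1, keepdims=True))
    pos_count = same_class.sum(axis=1)
    has_pos = pos_count > 0
    if has_pos.any():
        per_anchor = -(log_prob * same_class).sum(axis=1)[has_pos] / pos_count[has_pos]
        supcon = float(per_anchor.mean())
    else:
        supcon = 0.0

    # Divergence penalty: only same-class pairs that are more similar
    # than the threshold contribute, which discourages collapse.
    excess = np.clip(sim - sim_threshold, 0.0, None) * same_class
    n_pairs = same_class.sum()
    penalty = float(excess.sum() / n_pairs) if n_pairs > 0 else 0.0

    return supcon + penalty_weight * penalty, supcon, penalty
```

In this sketch, a batch whose same-class features are identical incurs the maximum penalty, while a batch whose same-class similarities stay below `sim_threshold` incurs none, so the penalty only acts on the excessively similar pairs that the summary identifies as the source of feature collapse.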