DualFed: Enjoying both Generalization and Personalization in Federated Learning via Hierachical Representations

Guogang Zhu, Xuefeng Liu, Jianwei Niu, Shaojie Tang, Xinghao Wu, Jiayuan Zhang
DOI: https://doi.org/10.1145/3664647.3681260
2024-07-25
Abstract: In personalized federated learning (PFL), it is widely recognized that achieving both high model generalization and effective personalization poses a significant challenge due to their conflicting nature. As a result, existing PFL methods can only manage a trade-off between these two objectives. This raises an interesting question: Is it feasible to develop a model capable of achieving both objectives simultaneously? Our paper presents an affirmative answer, and the key lies in the observation that deep models inherently exhibit hierarchical architectures, which produce representations with various levels of generalization and personalization at different stages. A straightforward approach stemming from this observation is to select multiple representations from these layers and combine them to concurrently achieve generalization and personalization. However, the number of candidate representations is commonly huge, which makes this method infeasible due to high computational costs. To address this problem, we propose DualFed, a new method that can directly yield dual representations corresponding to generalization and personalization respectively, thereby simplifying the optimization task. Specifically, DualFed inserts a personalized projection network between the encoder and classifier. The pre-projection representations are able to capture generalized information shareable across clients, and the post-projection representations are effective at capturing task-specific information on local clients. This design minimizes the mutual interference between generalization and personalization, thereby achieving a win-win situation. Extensive experiments show that DualFed can outperform other FL methods. Code is available at https://github.com/GuogangZhu/DualFed.
Subjects: Machine Learning; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

This paper aims to address a core challenge in Personalized Federated Learning (PFL): how to achieve both high generalization and effective personalization within the same model. Because generalization and personalization are often conflicting goals, existing PFL methods can only trade off between the two. The paper proposes a new method called DualFed, which uses hierarchical representations to achieve both goals simultaneously.

### Background and Motivation

In real-world applications, the data held by different clients is often not independent and identically distributed (Non-IID). For example, in video surveillance, data collected by different cameras can vary significantly due to different weather and lighting conditions. This Non-IID data distribution can significantly degrade the performance of Federated Learning (FL) models. Currently, there are two main approaches to this issue: improving the model's generalization ability so that it accommodates more clients, or enhancing the model's personalization ability so that it better fits the local data distribution. However, since the local data distribution usually differs from the global distribution, these two optimization goals are often conflicting.

### Limitations of Existing Methods

Early PFL methods mainly balanced client collaboration and local adaptation by sharing either the classifier or the encoder while personalizing the other part. These methods can only ensure that the encoder generates either generalized or personalized representations, but not both simultaneously. Some PFL methods attempt to personalize specific parameters within the encoder to extract representations that have both generalization and personalization features. Although these methods alleviate the conflict between generalization and personalization to some extent, they still involve trade-offs.
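As a point of reference for the parameter-decoupling baselines described above, the following is a minimal sketch of "share one part, personalize the other". It is not taken from the paper: the function `aggregate_shared`, the parameter names `encoder.w` and `classifier.w`, and the toy values are all hypothetical, and plain weighted averaging is assumed as the aggregation rule.

```python
import numpy as np

def aggregate_shared(client_states, shared_keys, weights=None):
    """Weighted average of only the parameters listed in shared_keys.

    Parameters not listed stay on the clients and are never averaged.
    (Hypothetical helper; averaging rule is an assumption.)
    """
    n = len(client_states)
    if weights is None:
        weights = [1.0 / n] * n
    return {
        key: sum(w * state[key] for w, state in zip(weights, client_states))
        for key in shared_keys
    }

# Two hypothetical clients, each holding encoder and classifier weights.
clients = [
    {"encoder.w": np.array([1.0, 2.0]), "classifier.w": np.array([10.0])},
    {"encoder.w": np.array([3.0, 4.0]), "classifier.w": np.array([20.0])},
]

# Share the encoder across clients; keep each classifier personalized.
global_part = aggregate_shared(clients, shared_keys=["encoder.w"])
for state in clients:
    state.update({k: v.copy() for k, v in global_part.items()})
```

After this round every client holds the same averaged encoder, while each classifier keeps its local value. Swapping the contents of `shared_keys` gives the opposite variant (shared classifier, personalized encoder). Either way, the encoder output is tied to a single role, which is the limitation the summary points out.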
### Innovation of Hierarchical Representations

The core innovation of the paper lies in utilizing the hierarchical structure of deep models to extract representations at different levels. Specifically, shallow layers capture general patterns that can transfer across different data distributions, while deeper layers gradually filter out irrelevant components, retaining information crucial for downstream tasks. This indicates that generalized and personalized representations already exist within the model. By leveraging these hidden generalized and personalized representations, it is possible to achieve both high generalization and personalization in PFL.

### DualFed Method

To achieve this goal, the paper proposes the DualFed method. This method inserts a personalized projection network between the encoder and the classifier, generating two stages of representations: pre-projection representations and post-projection representations. Pre-projection representations are generated before the projection network and are separated from the local task, making them more transferable; post-projection representations are generated after the projection network and are closer to the decision layer, making them more discriminative and personalized. In this way, DualFed effectively decouples generalized and personalized representations, addressing the issue of pursuing conflicting goals within the same stage of representation.

### Experimental Results

The paper validates the effectiveness of DualFed through extensive experiments on multiple datasets. The experimental results show that DualFed outperforms existing FL methods on multiple metrics, achieving both high generalization and personalization simultaneously.

### Conclusion

By introducing hierarchical representations and a personalized projection network, the paper successfully optimizes both generalization and personalization in Personalized Federated Learning. This method provides a new approach to addressing the challenges of Federated Learning under Non-IID data distribution.
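The encoder, projection, classifier layout described under the DualFed method can be illustrated with a small NumPy forward pass. This is only a sketch under stated assumptions, not the authors' implementation (see the linked repository for that): single linear layers with ReLU stand in for real networks, all dimensions and helper names (`init_linear`, `make_client`, `forward`) are hypothetical, and how the classifier is shared or personalized is not specified in this summary, so it is simply kept per client here.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_linear(in_dim, out_dim):
    """Randomly initialized linear layer (hypothetical stand-in for a network)."""
    return {"w": rng.normal(0.0, 0.1, (in_dim, out_dim)), "b": np.zeros(out_dim)}

def linear(params, x):
    return x @ params["w"] + params["b"]

def relu(x):
    return np.maximum(x, 0.0)

def make_client(global_encoder, feat_dim=8, proj_dim=8, n_classes=3):
    return {
        # Encoder: copied from the global model, so it is common to all clients.
        "encoder": {k: v.copy() for k, v in global_encoder.items()},
        # Projection: initialized and kept locally (the personalized part).
        "projection": init_linear(feat_dim, proj_dim),
        # Classifier: kept per client here purely as an assumption.
        "classifier": init_linear(proj_dim, n_classes),
    }

def forward(client, x):
    pre = relu(linear(client["encoder"], x))        # pre-projection representation
    post = relu(linear(client["projection"], pre))  # post-projection representation
    logits = linear(client["classifier"], post)
    return pre, post, logits

global_encoder = init_linear(4, 8)
client_a = make_client(global_encoder)
client_b = make_client(global_encoder)

x = rng.normal(size=(5, 4))  # a toy batch of 5 inputs with 4 features
pre_a, post_a, logits_a = forward(client_a, x)
pre_b, post_b, logits_b = forward(client_b, x)
```

For the same input, both clients produce identical pre-projection representations because the encoder is common, while their post-projection representations differ because each client owns its projection network. That separation into two stages is what the summary refers to as decoupling generalized from personalized representations.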