Heterogeneous Federated Learning: State-of-the-art and Research Challenges

Mang Ye, Xiuwen Fang, Bo Du, Pong C. Yuen, Dacheng Tao
2023-09-08
Abstract: Federated learning (FL) has drawn increasing attention owing to its potential use in large-scale industrial applications. Existing federated learning works mainly focus on model-homogeneous settings. However, practical federated learning typically faces heterogeneity in data distributions, model architectures, network environments, and hardware devices among participating clients. Heterogeneous Federated Learning (HFL) is much more challenging, and the corresponding solutions are diverse and complex. Therefore, a systematic survey of the research challenges and the state of the art on this topic is essential. In this survey, we first summarize the research challenges in HFL from five aspects: statistical heterogeneity, model heterogeneity, communication heterogeneity, device heterogeneity, and additional challenges. We then review recent advances in HFL and propose a new taxonomy of existing HFL methods, with an in-depth analysis of their pros and cons. We classify existing methods into three levels according to the HFL procedure: data-level, model-level, and server-level. Finally, we discuss several critical and promising future research directions in HFL, which may facilitate further developments in this field. A periodically updated collection on HFL is available at https://github.com/marswhu/HFL_Survey.
Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper attempts to address multiple challenges in Heterogeneous Federated Learning (HFL). Specifically, the paper focuses on the following five challenges:

1. **Statistical Heterogeneity**:
   - Inconsistent data distribution, meaning the data from different participants may not be independent and identically distributed (Non-IID). This heterogeneity may cause local models to converge in different directions, preventing the achievement of a global optimal solution.
   - Specifically includes label skew, feature skew, quality skew, and quantity skew.
2. **Model Heterogeneity**:
   - Different clients may have different tasks and specific needs, so each client may design different local models. This leads to obstacles in knowledge transfer, and traditional model aggregation or gradient operations cannot be directly applied.
   - Model heterogeneity is divided into partial heterogeneity and complete heterogeneity.
3. **Communication Heterogeneity**:
   - Clients may be deployed in different network environments, leading to inconsistent and asynchronous communication. This affects learning efficiency, especially when the number of participating clients is large, limiting the application in large-scale industrial scenarios.
4. **Device Heterogeneity**:
   - The devices of different participants may differ in storage and computing capabilities, which may cause some participating nodes to fail or become inactive.
   - Device heterogeneity leads to low communication efficiency, affecting overall performance.
5. **Additional Challenges**:
   - Knowledge transfer obstacles: The difficulty of effective learning between different clients.
   - Privacy leakage: Sensitive information from local data sources may be exposed to other participants.
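To make the label-skew form of statistical heterogeneity concrete, here is a minimal sketch of how it is often simulated in experiments: each class is split across clients according to a Dirichlet-distributed share vector. This is an illustration under stated assumptions, not a method taken from the paper. The function names `dirichlet_label_skew` and `label_histogram` and the concentration parameter `alpha` are hypothetical, chosen only for this example.

```python
import random
from collections import Counter


def dirichlet_label_skew(labels, num_clients, alpha, seed=0):
    """Partition sample indices across clients with per-class Dirichlet shares.

    Assumption for illustration: a smaller `alpha` concentrates each class on
    fewer clients (stronger label skew); a large `alpha` approaches an even split.
    """
    rng = random.Random(seed)
    client_indices = [[] for _ in range(num_clients)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Sample Dirichlet(alpha, ..., alpha) by normalizing Gamma draws.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(gammas) or 1.0  # guard against an all-zero draw
        shares = [g / total for g in gammas]
        # Turn cumulative shares into cut points over this class's samples.
        start, cum = 0, 0.0
        for c in range(num_clients):
            cum += shares[c]
            if c == num_clients - 1:
                end = len(idx)  # last client takes whatever remains
            else:
                end = min(len(idx), int(round(cum * len(idx))))
            client_indices[c].extend(idx[start:end])
            start = end
    return client_indices


def label_histogram(labels, indices):
    """Count how many samples of each class a client holds."""
    return Counter(labels[i] for i in indices)


if __name__ == "__main__":
    # Toy dataset: 10 balanced classes, 1000 samples.
    toy_labels = [i % 10 for i in range(1000)]
    for alpha in (0.1, 100.0):
        parts = dirichlet_label_skew(toy_labels, num_clients=5, alpha=alpha)
        print(f"alpha={alpha}")
        for client_id, part in enumerate(parts):
            print(f"  client {client_id}: {dict(label_histogram(toy_labels, part))}")
```

Running the demo prints per-client class counts for a low and a high `alpha`. Comparing the two histograms gives a rough feel for why local models trained on skewed partitions may drift in different directions.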
The paper systematically summarizes these challenges and proposes a new taxonomy of existing HFL methods (data-level, model-level, and server-level), aiming to give researchers and practitioners a comprehensive view of the complex issues in heterogeneous federated learning. It also reviews current state-of-the-art methods with their pros and cons, and outlines future research directions.