FedBone: Towards Large-Scale Federated Multi-Task Learning

Yi-Qiang Chen, Teng Zhang, Xin-Long Jiang, Qian Chen, Chen-Long Gao, Wu-Liang Huang
DOI: https://doi.org/10.1007/s11390-024-3639-x
IF: 1.871
2024-12-08
Journal of Computer Science and Technology
Abstract: Federated multi-task learning (FMTL) has emerged as a promising framework for learning multiple tasks simultaneously with client-aware personalized models. While the majority of studies have focused on dealing with the non-independent and identically distributed (Non-IID) characteristics of client datasets, the issue of task heterogeneity has largely been overlooked. Dealing with task heterogeneity often requires complex models, making it impractical for federated learning in resource-constrained environments. In addition, the varying nature of these heterogeneous tasks introduces inductive biases, leading to interference during aggregation and potentially resulting in biased global models. To address these issues, we propose a hierarchical FMTL framework, referred to as FedBone, to facilitate the construction of large-scale models with improved generalization. FedBone leverages server-client split learning and gradient projection to split the entire model into two components: 1) a large-scale general model (referred to as the general model) on the cloud server, and 2) multiple task-specific models (referred to as client models) on edge clients, accommodating devices with limited compute power. To enhance the robustness of the large-scale general model, we incorporate the conflicting gradient projection technique into FedBone to rectify the skewed gradient direction caused by aggregating gradients from heterogeneous tasks. The proposed FedBone framework is evaluated on three benchmark datasets and one real ophthalmic dataset. The comprehensive experiments demonstrate that FedBone efficiently adapts to the heterogeneous local tasks of each client and outperforms existing federated learning algorithms in various dense prediction and classification tasks while utilizing off-the-shelf computational resources on the client side.
computer science, software engineering, hardware & architecture
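
The abstract mentions conflicting gradient projection but does not spell out the exact rule. The sketch below is only an illustration of one common form of the idea (a PCGrad-style projection), not the paper's implementation. It assumes each task's gradient is a flat list of floats and that two gradients "conflict" when their dot product is negative. The function names `project_conflicts` and `aggregate` are hypothetical, and the fixed iteration order is a simplification chosen so the result is reproducible.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_conflicts(grads: List[Vector]) -> List[Vector]:
    """Return a copy of each task gradient with conflicting components removed.

    Assumption: a gradient g conflicts with another gradient h when
    dot(g, h) < 0. In that case g is projected onto the plane orthogonal
    to h, i.e. g <- g - (g.h / |h|^2) * h. Projections are always taken
    against the ORIGINAL gradients of the other tasks.
    """
    projected = []
    for i, g in enumerate(grads):
        g = list(g)  # work on a copy so the inputs stay untouched
        for j, other in enumerate(grads):
            if i == j:
                continue
            overlap = dot(g, other)
            norm_sq = dot(other, other)
            if overlap < 0 and norm_sq > 0:
                scale = overlap / norm_sq
                g = [x - scale * y for x, y in zip(g, other)]
        projected.append(g)
    return projected


def aggregate(grads: List[Vector]) -> Vector:
    """Average the task gradients after conflict projection."""
    projected = project_conflicts(grads)
    n = len(projected)
    return [sum(column) / n for column in zip(*projected)]


if __name__ == "__main__":
    # Two hypothetical task gradients that point in partly opposite directions.
    task_grads = [[1.0, 0.0], [-1.0, 1.0]]
    print(project_conflicts(task_grads))
    print(aggregate(task_grads))
```

With this rule, gradients that do not conflict pass through unchanged, so the aggregate reduces to a plain average in that case. Only the components that would pull one task's update against another's are removed before averaging.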