Abstract: Hierarchical Federated Learning (HFL) has recently emerged as an effective approach for learning from massive data distributed across edge devices (clients). To alleviate the computation and communication burden of training large-scale models on resource-constrained clients, Hierarchical Split Federated Learning (HSFL) has been proposed: it splits a model into a bottom and a top (sub-)model and offloads the training of the top model to the edge server. Nonetheless, two key issues, system heterogeneity and dynamic contexts, hinder the effective application of HSFL. In response, we present an efficient HSFL method, termed AdaHSFL, which introduces intra-cluster gradient feedback regulation and inter-cluster updating frequency optimization to enhance training efficiency. Concretely, intra-cluster gradient feedback regulation enables edge servers to process incoming smashed data immediately and compute gradients using copies of the top model on background threads, while inter-cluster updating frequency optimization adjusts the updating frequency of each edge cluster to align the training duration of all clusters with that of the fastest cluster. Furthermore, AdaHSFL applies these two strategies jointly to eliminate the intra- and inter-cluster idle waiting time incurred by synchronization barriers. Rigorous performance evaluation demonstrates that AdaHSFL improves accuracy by $3.9\%$ to $25.5\%$ within given time budgets compared with the baselines.
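
The inter-cluster idea can be illustrated with a minimal sketch. It is not the AdaHSFL algorithm itself, whose optimization is not specified in the abstract. It assumes that each cluster has a known, fixed time per local update and that a slower cluster simply performs fewer updates per global round, so that every cluster reaches the synchronization barrier at roughly the time the fastest cluster does. The function names `align_update_frequencies` and `idle_times`, the cluster identifiers, and the timing values are all hypothetical.

```python
from math import floor


def align_update_frequencies(iter_times, base_frequency):
    """Pick per-cluster update counts so each cluster's round duration
    approximately matches that of the fastest cluster.

    iter_times: dict mapping cluster id -> seconds per local update (assumed known).
    base_frequency: number of updates the fastest cluster performs per global round.
    """
    if not iter_times:
        raise ValueError("iter_times must not be empty")
    if base_frequency < 1:
        raise ValueError("base_frequency must be at least 1")
    if any(t <= 0 for t in iter_times.values()):
        raise ValueError("per-update times must be positive")

    # Duration of one global round as experienced by the fastest cluster.
    target = base_frequency * min(iter_times.values())
    # Slower clusters run fewer updates; every cluster runs at least one.
    # The small epsilon guards against floating-point error in the division.
    return {cid: max(1, floor(target / t + 1e-9)) for cid, t in iter_times.items()}


def idle_times(iter_times, frequencies):
    """Idle time of each cluster while waiting at the synchronization barrier."""
    durations = {cid: frequencies[cid] * iter_times[cid] for cid in iter_times}
    barrier = max(durations.values())  # the round ends when the slowest cluster finishes
    return {cid: barrier - d for cid, d in durations.items()}


# Hypothetical per-update times (seconds) for three edge clusters.
times = {"A": 0.5, "B": 1.0, "C": 2.0}

uniform = {cid: 8 for cid in times}              # same frequency everywhere
aligned = align_update_frequencies(times, 8)     # frequencies matched to cluster A

print(idle_times(times, uniform))
print(aligned, idle_times(times, aligned))
```

With a uniform frequency, the two faster clusters wait for the slowest one at the barrier; with aligned frequencies, all three clusters in this toy setting finish the round at the same time. In a dynamic context the per-update times would have to be re-estimated between rounds, which this sketch does not attempt.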

Accelerating Hierarchical Federated Learning with Model Splitting in Edge Computing