Abstract: Real-time machine learning (ML) has recently attracted significant interest due to its potential to support instantaneous learning, adaptation, and decision making in a wide range of application domains, including self-driving vehicles, intelligent transportation, and industrial automation. In this paper, we investigate real-time ML in a federated edge intelligence (FEI) system, an edge computing system that implements federated learning (FL) solutions based on data samples collected and uploaded from decentralized data networks, e.g., Internet-of-Things (IoT) and/or wireless sensor networks. FEI systems often exhibit heterogeneous distributions of communication and computational resources, as well as non-i.i.d. data samples arriving at different edge servers, resulting in long model training time and inefficient resource utilization. Motivated by this fact, we propose a time-sensitive federated learning (TS-FL) framework to minimize the overall run-time for collaboratively training a shared ML model with desirable accuracy. Training acceleration solutions for TS-FL with both synchronous coordination (TS-FL-SC) and asynchronous coordination (TS-FL-ASC) are investigated. To address the straggler effect in TS-FL-SC, we develop an analytical solution to characterize the impact of selecting different subsets of edge servers on the overall model training time. A server dropping-based solution is proposed that allows some slow edge servers to be excluded from model training if their impact on the resulting model accuracy is limited. A joint optimization algorithm is proposed to minimize the overall time consumption of model training by selecting the participating edge servers, the local epoch number (the number of model training iterations per coordination), and the data batch size (the number of data samples for each model training iteration).
Motivated by the fact that data samples at the slowest edge server may exhibit special characteristics that cannot be removed from model training, we develop an analytical expression to characterize the impact of both the staleness effect of asynchronous coordination and the straggler effect of FL on the time consumption of TS-FL-ASC. We propose a load forwarding-based solution that allows a slow edge server to offload part of its training samples to trusted edge servers with higher processing capability. We develop a hardware prototype to evaluate the model training time of a heterogeneous FEI system. Experimental results show that our proposed TS-FL-SC and TS-FL-ASC can reduce the overall model training time by up to 63% and 28%, respectively, compared with traditional FL solutions.
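
The straggler effect under synchronous coordination can be illustrated with a minimal sketch. It is not the analytical model or the optimization algorithm of the paper: the per-server time model (local epochs times batch size divided by a processing rate, plus an upload time), the `EdgeServer` fields, and all numbers are hypothetical assumptions chosen only to show why one slow server can dominate the round time and how dropping it shortens the round.

```python
from dataclasses import dataclass


@dataclass
class EdgeServer:
    name: str
    samples_per_sec: float  # hypothetical processing rate
    upload_sec: float       # hypothetical time to upload one model update


def local_time(server, local_epochs, batch_size):
    """Toy per-round time at one server: compute time plus upload time."""
    return local_epochs * batch_size / server.samples_per_sec + server.upload_sec


def round_time(servers, local_epochs, batch_size):
    """Synchronous coordination waits for the slowest participant."""
    return max(local_time(s, local_epochs, batch_size) for s in servers)


def drop_slowest(servers, k, local_epochs, batch_size):
    """Return the participants left after removing the k slowest servers."""
    if not 0 <= k < len(servers):
        raise ValueError("k must leave at least one participating server")
    ranked = sorted(servers, key=lambda s: local_time(s, local_epochs, batch_size))
    return ranked[:len(ranked) - k]


# Made-up example values: server "c" is the straggler.
servers = [
    EdgeServer("a", samples_per_sec=200.0, upload_sec=0.5),
    EdgeServer("b", samples_per_sec=100.0, upload_sec=1.0),
    EdgeServer("c", samples_per_sec=10.0, upload_sec=2.0),
]
full = round_time(servers, local_epochs=5, batch_size=32)
kept = drop_slowest(servers, k=1, local_epochs=5, batch_size=32)
reduced = round_time(kept, local_epochs=5, batch_size=32)
print(f"all servers: {full:.1f} s per round, after dropping one: {reduced:.1f} s")
```

The sketch deliberately ignores the accuracy side of the trade-off. The abstract conditions dropping on a limited impact on model accuracy, which a time-only model like this one cannot capture.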

Efficient Knowledge Management for Heterogeneous Federated Continual Learning on Resource-Constrained Edge Devices

Resource-Efficient Heterogenous Federated Continual Learning on Edge

Edge-cloud Collaborative Learning with Federated and Centralized Features

Cross-FCL: Toward a Cross-edge Federated Continual Learning Framework in Mobile Edge Computing Systems

Towards Efficient Asynchronous Federated Learning in Heterogeneous Edge Environments

Multi-granularity Weighted Federated Learning in Heterogeneous Mobile Edge Computing Systems

Age-Aware Data Selection and Aggregator Placement for Timely Federated Continual Learning in Mobile Edge Computing

AnycostFL: Efficient On-Demand Federated Learning over Heterogeneous Edge Devices

Accurate Forgetting for Heterogeneous Federated Continual Learning

Learn from Others and Be Yourself in Heterogeneous Federated Learning

Energy-Efficient and Reliable Federated Learning in Heterogeneous Mobile-Edge Computing

An Efficient Asynchronous Federated Learning Protocol for Edge Devices

Federated Learning with Dynamic Epoch Adjustment and Collaborative Training in Mobile Edge Computing

Federated Continual Learning via Knowledge Fusion: A Survey

Towards Long-Term Remembering in Federated Continual Learning

Time-sensitive Learning for Heterogeneous Federated Edge Intelligence

Importance-aware Data Selection and Resource Allocation for Hierarchical Federated Edge Learning

Multicenter Hierarchical Federated Learning with Fault-Tolerance Mechanisms for Resilient Edge Computing Networks

Energy-Aware Edge Association for Cluster-based Personalized Federated Learning

Resilient and Communication Efficient Learning for Heterogeneous Federated Systems