Tackling Mavericks in Federated Learning via Adaptive Client Selection Strategy

Jiyue Huang, Chi Hong, Yang Liu, Lydia Y. Chen, Stefanie Roos
2021-01-01
Abstract: Federated learning (FL) enables collaborative learning across data parties that differ in data quantity and distribution. To ensure fast convergence and high accuracy on such heterogeneous clients, it is imperative to select, in a timely manner, clients who can effectively contribute to learning. A relevant but overlooked case is that of Maverick clients, who monopolize the possession of certain data types, e.g., children's hospitals possess most of the data on pediatric cardiology. In this paper, we tackle the challenges posed by Mavericks. We explore two types of client selection strategies, based on Shapley Value measurement and on distribution distance. We first show, theoretically and through simulations, that Shapley Value underestimates the contribution of Mavericks and thus falls short in selecting the right clients. We also propose FEDEMD, an adaptive client selection strategy based on the Wasserstein distance between the local and global data distributions, supported by a proven convergence bound. Because FEDEMD adapts the selection probability such that Mavericks are preferentially selected when the model benefits from improvement on rare classes, it consistently ensures fast convergence in the presence of different types of Mavericks. Compared to existing strategies, including Shapley Value based ones, FEDEMD improves the convergence of neural network classifiers by 26.9% with FedAvg aggregation, and it performs well across various levels of heterogeneity.
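To make the idea of distance-based selection concrete, the following is a minimal sketch and not the authors' FEDEMD algorithm. It assumes each client reports per-class label counts, treats class indices as points on a line with unit spacing (a simplification, since class labels are categorical), and turns the resulting Wasserstein-1 distances into selection probabilities with a softmax. The function names `label_wasserstein` and `selection_probs` and the `temperature` parameter are hypothetical; the abstract does not specify how FEDEMD adapts the probabilities over training rounds.

```python
import numpy as np

def label_wasserstein(p, q):
    """Wasserstein-1 distance between two label distributions.

    Assumption: classes sit at positions 0, 1, ..., K-1 with unit spacing,
    so the distance is the summed absolute difference of the two CDFs.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())

def selection_probs(distances, temperature=1.0):
    """Hypothetical rule: softmax over distances, so clients whose local
    distribution is farther from the global one are picked more often."""
    d = np.asarray(distances, dtype=float) / temperature
    w = np.exp(d - d.max())  # subtract the max for numerical stability
    return w / w.sum()

# Toy setup: two ordinary clients and one Maverick that owns all of class 2.
client_counts = np.array([
    [50, 50, 0],
    [50, 50, 0],
    [0, 0, 100],
])
local_dists = client_counts / client_counts.sum(axis=1, keepdims=True)
global_dist = client_counts.sum(axis=0) / client_counts.sum()

distances = [label_wasserstein(p, global_dist) for p in local_dists]
probs = selection_probs(distances)
```

In this toy setup the Maverick's local distribution lies farthest from the global one, so it receives the largest selection probability. An adaptive scheme in the spirit of the abstract would additionally reduce that preference once the model no longer benefits from the rare class.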