MHAT: an Efficient Model-Heterogenous Aggregation Training Scheme for Federated Learning.

Li Hu, Hongyang Yan, Lang Li, Zijie Pan, Xiaozhang Liu, Zulong Zhang
DOI: https://doi.org/10.1016/j.ins.2021.01.046
IF: 8.1
2021-01-01
Information Sciences
Abstract: Federated learning allows multiple participants to jointly train a global model while guaranteeing the confidentiality and integrity of private datasets. However, current server aggregation algorithms for federated learning focus only on model parameters, resulting in heavy communication costs and slow convergence. Most importantly, they cannot handle the scenario in which different clients hold local models with different network architectures. In this paper, we view these challenges from an alternative perspective: we draw attention to what should be aggregated and how to improve convergence efficiency. Specifically, we propose MHAT, a novel model-heterogeneous aggregation training scheme for federated learning, which uses Knowledge Distillation (KD) to extract the update information from the heterogeneous models of all clients and trains an auxiliary model on the server to aggregate that information. MHAT frees clients from having to adopt a unified model architecture and significantly reduces the required computing resources while maintaining acceptable model convergence accuracy. Various experiments verify the effectiveness and applicability of our proposed scheme. © 2021 Elsevier Inc. All rights reserved.

Federated learning [1–3] is an emerging machine learning paradigm for decentralized data [4,5], which enables multiple parties to collaboratively train a global model without sharing their private data. In the canonical federated learning protocol [6], model parameters are the information exchanged between clients and the server. At the beginning of each training round, the central server distributes the current global model to all participants, and each participant updates its local model using its private data. The server then collects the updated model information from all parties and obtains the global model via a weighted-average aggregation algorithm.
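The weighted-average aggregation step of the canonical protocol can be illustrated with a minimal sketch. This is not the MHAT scheme and not code from the paper. It assumes that each client reports its parameters as a dictionary of flat float lists and that the weights are the local sample counts, as is common in federated averaging. The names `weighted_average`, `client_params`, and `num_samples` are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumed representation: parameter name -> flat list of floats.
Params = Dict[str, List[float]]


def weighted_average(client_params: Sequence[Params],
                     num_samples: Sequence[int]) -> Params:
    """Average client parameters, weighting each client by its sample count."""
    if not client_params or len(client_params) != len(num_samples):
        raise ValueError("need one sample count per client")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Params = {}
    # Every client must expose the same parameter names and shapes,
    # i.e. the same architecture.
    for name, reference in client_params[0].items():
        aggregated[name] = [
            sum(params[name][i] * n
                for params, n in zip(client_params, num_samples)) / total
            for i in range(len(reference))
        ]
    return aggregated


# Usage: two clients holding 1 and 3 samples respectively.
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
global_params = weighted_average(clients, [1, 3])
print(global_params)  # {'w': [2.5, 3.5]}
```

The sketch only works when all clients share identical parameter names and shapes, which is the restriction on model architecture that the abstract identifies as a limitation of parameter-based aggregation.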
However, in practical federated learning, several major challenges remain unsolved when using model parameters