Improving Model Consistency of Decentralized Federated Learning via Sharpness Aware Minimization and Multiple Gossip Approaches

Yifan Shi, Li Shen, Kang Wei, Yan Sun, Bo Yuan, Xueqian Wang, Dacheng Tao
2023-01-01
Abstract: To mitigate privacy leakage and reduce the communication burden of Federated Learning (FL), decentralized FL (DFL) discards the central server, and each client communicates only with its neighbors in the decentralized communication network. However, existing DFL algorithms tend to exhibit high inconsistency among local models, which results in severe distribution shifts across clients and inferior performance compared with centralized FL (CFL), especially on heterogeneous data or with sparsely connected communication topologies. To alleviate this challenge, we propose two DFL algorithms, DFedSAM and DFedSAM-MGS. Specifically, DFedSAM leverages gradient perturbation to generate locally flat models via Sharpness Aware Minimization (SAM), which searches for model parameters with uniformly low loss values. In addition, DFedSAM-MGS further boosts DFedSAM by adopting Multiple Gossip Steps (MGS) for better model consistency, which accelerates the aggregation of the locally flat models and better balances communication complexity against learning performance. Theoretically, we present the improved convergence rates $\mathcal{O}\big(\frac{1}{T}+\frac{1}{T^2(1-\lambda)^2}\big)$ and $\mathcal{O}\big(\frac{1}{T}+\frac{\lambda^Q+1}{T^2(1-\lambda^Q)^2}\big)$ in the stochastic non-convex setting for DFedSAM and DFedSAM-MGS, respectively, where $1-\lambda$ is the spectral gap of the gossip matrix $W$ and $Q$ is the number of gossip steps in MGS. Empirically, we confirm that our methods achieve competitive performance compared with CFL baselines and outperform existing DFL baselines.
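The two mechanisms named in the abstract, a SAM-style perturbed gradient on each client and repeated neighbor averaging with a gossip matrix $W$, can be illustrated with a toy sketch. This is not the authors' implementation: the quadratic local losses, the ring topology with weight 1/3 on each neighbor, the single local step per round, and every function and parameter name (`sam_gradient`, `gossip`, `run`, `rho`, `gossip_steps`) are assumptions chosen only to keep the example short and dependency-free.

```python
import math


def ring_gossip_matrix(n):
    """Doubly stochastic mixing matrix W for a ring (assumed topology):
    each client weights itself and its two neighbors by 1/3."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def sam_gradient(w, c, rho):
    """SAM-style gradient for the toy loss f(w) = 0.5 * ||w - c||^2.
    The gradient is evaluated at w + rho * g / ||g|| instead of at w."""
    g = [wi - ci for wi, ci in zip(w, c)]                    # gradient at w
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12       # avoid division by zero
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]   # ascent step of radius rho
    return [wi - ci for wi, ci in zip(w_adv, c)]             # gradient at perturbed point


def gossip(models, W, steps):
    """Average with neighbors `steps` times, i.e. models <- W^steps @ models."""
    n, d = len(models), len(models[0])
    for _ in range(steps):
        models = [[sum(W[i][j] * models[j][k] for j in range(n)) for k in range(d)]
                  for i in range(n)]
    return models


def consensus_distance(models):
    """Mean squared distance of the client models from their average."""
    n, d = len(models), len(models[0])
    mean = [sum(m[k] for m in models) / n for k in range(d)]
    return sum(sum((m[k] - mean[k]) ** 2 for k in range(d)) for m in models) / n


def run(n=8, rounds=200, lr=0.1, rho=0.05, gossip_steps=1):
    """One SAM step per client per round (a simplification), then gossip."""
    targets = [[float(i), -float(i)] for i in range(n)]  # heterogeneous local optima
    W = ring_gossip_matrix(n)
    models = [[0.0, 0.0] for _ in range(n)]
    for _ in range(rounds):
        models = [[wi - lr * gi for wi, gi in zip(w, sam_gradient(w, c, rho))]
                  for w, c in zip(models, targets)]
        models = gossip(models, W, gossip_steps)
    return models


if __name__ == "__main__":
    for q in (1, 4):
        print("gossip steps:", q, "consensus distance:", consensus_distance(run(gossip_steps=q)))
```

In this toy setting, raising `gossip_steps` from 1 to 4 shrinks the consensus distance, because applying $W$ a total of $Q$ times replaces the mixing factor $\lambda$ with $\lambda^Q$. That is the same substitution that distinguishes the two convergence rates quoted in the abstract. Each extra gossip step costs one more exchange with the neighbors per round.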