SpreadFGL: Edge-Client Collaborative Federated Graph Learning with Adaptive Neighbor Generation

Luying Zhong, Yueyang Pi, Zheyi Chen, Zhengxin Yu, Wang Miao, Xing Chen, Geyong Min
2024-07-14
Abstract: Federated Graph Learning (FGL) has garnered widespread attention by enabling collaborative training on multiple clients for semi-supervised classification tasks. However, most existing FGL studies do not adequately consider the missing inter-client topology information in real-world scenarios, causing insufficient feature aggregation of multi-hop neighbor clients during model training. Moreover, classic FGL commonly adopts FedAvg but neglects the high training costs incurred as the number of clients grows, resulting in the overload of a single edge server. To address these important challenges, we propose a novel FGL framework, named SpreadFGL, to promote the information flow in edge-client collaboration and extract more generalized potential relationships between clients. In SpreadFGL, an adaptive graph imputation generator incorporated with a versatile assessor is first designed to exploit the potential links between subgraphs, without sharing raw data. Next, a new negative sampling mechanism is developed to make SpreadFGL concentrate on more refined information in downstream tasks. To facilitate load balancing at the edge layer, SpreadFGL follows a distributed training manner that enables fast model convergence. Using a real-world testbed and benchmark graph datasets, extensive experiments demonstrate the effectiveness of the proposed SpreadFGL. The results show that SpreadFGL achieves higher accuracy and faster convergence than state-of-the-art algorithms.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

The paper primarily addresses two key issues in Federated Graph Learning (FGL):

1. **Missing Cross-Client Topology Information**:
   - Existing FGL methods ignore potential links between clients during training, so features from multi-hop neighbors are insufficiently aggregated, which degrades classification performance.
2. **Single-Point Overload**:
   - As the number of clients grows, the classic FedAvg algorithm imposes high training costs on a single edge server, which overloads it.

To solve these issues, the authors propose a new framework named SpreadFGL. Its main contributions include:

- Proposing an improved centralized FGL framework, FedGL, which explores potential cross-subgraph links between clients through global information flow.
- Designing an adaptive graph imputation generator and a negative sampling mechanism to better extract the refined information needed for downstream tasks.
- Introducing a distributed FGL framework, SpreadFGL, which achieves better load balancing and faster model convergence by having multiple edge servers collaborate on model training.
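
To make the single-server overload concern concrete, the following is a minimal sketch of the standard FedAvg aggregation step (a sample-weighted average of client parameters) next to a two-tier variant in which several edge servers each aggregate only their own clients. It is not the authors' implementation: the summary does not specify how SpreadFGL's edge servers exchange parameters, and the names `fedavg`, `two_tier_fedavg`, and the toy client data are hypothetical. The sketch only illustrates that the aggregation work can be split across servers without changing the sample-weighted result.

```python
from typing import List, Sequence, Tuple

# (flattened model parameters, local sample count); a hypothetical representation
Update = Tuple[List[float], int]


def fedavg(updates: Sequence[Update]) -> Update:
    """Sample-weighted average of parameter vectors (the FedAvg aggregation step)."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    avg = [sum(w[i] * n for w, n in updates) / total for i in range(dim)]
    return avg, total


def two_tier_fedavg(groups: Sequence[Sequence[Update]]) -> Update:
    """Each edge server averages its own clients, then the per-server results are averaged.

    Assumption: servers combine results by the same sample-weighted rule.
    This is an illustration, not the exchange scheme used in SpreadFGL.
    """
    return fedavg([fedavg(group) for group in groups])


# Hypothetical toy data: four clients with two parameters each.
clients = [([1.0, 2.0], 10), ([3.0, 4.0], 30), ([5.0, 6.0], 20), ([7.0, 8.0], 40)]

flat, flat_total = fedavg(clients)                                  # one server handles all clients
tiered, tiered_total = two_tier_fedavg([clients[:2], clients[2:]])  # two servers, two clients each
```

In this toy setup, `flat` and `tiered` agree up to floating-point rounding, while each edge server in the two-tier case processes only half of the client updates.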