Enhancing Node Representations for Real-World Complex Networks with Topological Augmentation

Xiangyu Zhao, Zehui Li, Mingzhu Shen, Guy-Bart Stan, Pietro Liò, Yiren Zhao
2024-08-13
Abstract: Graph augmentation methods play a crucial role in improving the performance and enhancing generalisation capabilities in Graph Neural Networks (GNNs). Existing graph augmentation methods mainly perturb the graph structures, and are usually limited to pairwise node relations. These methods cannot fully address the complexities of real-world large-scale networks, which often involve higher-order node relations beyond only being pairwise. Meanwhile, real-world graph datasets are predominantly modelled as simple graphs, due to the scarcity of data that can be used to form higher-order edges. Therefore, reconfiguring the higher-order edges as an integration into graph augmentation strategies lights up a promising research path to address the aforementioned issues. In this paper, we present Topological Augmentation (TopoAug), a novel graph augmentation method that builds a combinatorial complex from the original graph by constructing virtual hyperedges directly from the raw data. TopoAug then produces auxiliary node features by extracting information from the combinatorial complex, which are used for enhancing GNN performances on downstream tasks. We design three diverse virtual hyperedge construction strategies to accompany the construction of combinatorial complexes: (1) via graph statistics, (2) from multiple data perspectives, and (3) utilising multi-modality. Furthermore, to facilitate TopoAug evaluation, we provide 23 novel real-world graph datasets across various domains including social media, biology, and e-commerce. Our empirical study shows that TopoAug consistently and significantly outperforms GNN baselines and other graph augmentation methods, across a variety of application contexts, which clearly indicates that it can effectively incorporate higher-order node relations into the graph augmentation for real-world complex networks.
Machine Learning, Information Retrieval, Social and Information Networks
What problem does this paper attempt to address?
This paper asks how graph augmentation can improve the quality of node representations learned by Graph Neural Networks (GNNs), especially on real-world complex networks. Existing graph augmentation methods mainly perturb the graph structure and are usually limited to pairwise relations between nodes. They cannot fully capture the complexity of real-world large-scale networks, which often involve higher-order relations among more than two nodes. In addition, because data that can form higher-order edges is scarce, most real-world graph datasets are modelled as simple graphs, which limits the use of higher-order GNNs such as hypergraph neural networks. To address this, the paper proposes Topological Augmentation (TopoAug), a new graph augmentation method. TopoAug builds a combinatorial complex from the original graph by constructing virtual hyperedges directly from the raw data, thereby capturing higher-order relations between nodes. It then extracts information from the combinatorial complex to produce auxiliary node features, which are used to enhance GNN performance on downstream tasks. The paper designs three virtual hyperedge construction strategies: via graph statistics, from multiple data perspectives, and utilising multi-modality. Experiments on 23 newly constructed real-world graph datasets show that TopoAug consistently and significantly outperforms GNN baselines and other graph augmentation methods across a variety of application contexts.
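To make the idea of virtual hyperedges and auxiliary node features more concrete, the following is a minimal, illustrative sketch in plain Python. It is not the authors' implementation. It assumes one possible graph-statistics rule (grouping nodes by degree rank) and one possible feature-extraction rule (averaging node features within each hyperedge), neither of which is specified in the text above. The function names `degree_hyperedges`, `auxiliary_features`, and `augment`, and the `num_bins` parameter, are hypothetical.

```python
def degree_hyperedges(edges, num_nodes, num_bins=2):
    """Build virtual hyperedges by grouping nodes of similar degree.

    Assumption: degree rank stands in for the "graph statistics" strategy;
    the paper may use a different statistic or grouping rule.
    """
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Stable sort, so ties keep ascending node-index order.
    order = sorted(range(num_nodes), key=lambda n: degree[n])
    size = -(-num_nodes // num_bins)  # ceiling division
    return [sorted(order[i:i + size]) for i in range(0, num_nodes, size)]


def auxiliary_features(features, hyperedges):
    """Give each node the average of the mean features of its hyperedges."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in features]
    counts = [0] * len(features)
    for members in hyperedges:
        # Mean feature vector over all nodes in this hyperedge.
        mean = [sum(features[n][d] for n in members) / len(members)
                for d in range(dim)]
        for n in members:
            counts[n] += 1
            for d in range(dim):
                sums[n][d] += mean[d]
    # Nodes in no hyperedge keep an all-zero auxiliary vector.
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


def augment(features, edges, num_bins=2):
    """Concatenate original node features with hyperedge-derived ones."""
    hyperedges = degree_hyperedges(edges, len(features), num_bins)
    aux = auxiliary_features(features, hyperedges)
    return [list(f) + a for f, a in zip(features, aux)]
```

For a star graph with edges `[(0, 1), (0, 2), (0, 3)]` and one feature per node, `augment` returns each node's original feature followed by one auxiliary value taken from its degree group. A downstream GNN would then consume the widened feature matrix in place of the original one.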