Can Large Language Models Act as Ensembler for Multi-GNNs?

Hanqi Duan, Yao Cheng, Jianxiang Yu, Xiang Li
2024-10-22
Abstract: Graph Neural Networks (GNNs) have emerged as powerful models for learning from graph-structured data. However, GNNs lack the inherent semantic understanding capability needed for rich textual node attributes, limiting their effectiveness in applications. On the other hand, we empirically observe that among existing GNN models, no single model consistently outperforms the others across diverse datasets. In this paper, we study whether LLMs can act as an ensembler for multi-GNNs and propose the LensGNN model. The model first aligns multiple GNNs, mapping the representations of different GNNs into the same space. Then, through LoRA fine-tuning, it aligns the space between the GNN and the LLM, injecting graph tokens and textual information into LLMs. This allows LensGNN to integrate multiple GNNs and leverage the LLM's strengths, resulting in better performance. Experimental results show that LensGNN outperforms existing models. This research advances text-attributed graph ensemble learning by providing a robust, superior solution for integrating semantic and structural information. We provide our code and data here: https://anonymous.4open.science/r/EnsemGNN-E267/.
Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses two problems:

1. **Graph Neural Networks (GNNs) lack inherent semantic understanding of nodes with rich text attributes**: GNNs capture graph structure well, but they often fail to make full use of the semantic information in node text. This limits their effectiveness in application scenarios such as citation networks and social media platforms.
2. **No existing GNN model performs consistently across datasets**: No single GNN model outperforms the others on every dataset. Comparing the node classification performance of four representative GNN models (APPNP, GAT, GCN, and GIN) on three benchmark datasets (Cora, Citeseer, and PubMed), the authors found that the relative ranking of the models changes from dataset to dataset, which makes choosing the optimal GNN model a challenge.

To address these challenges, the authors ask: **Can large language models (LLMs) serve as an ensembler for multiple GNN models?** To this end, they designed a model named LensGNN, which aims to improve performance on text-attributed graphs by combining the capabilities of multiple GNN models and LLMs. LensGNN works in two steps. It first aligns the GNN models by mapping their representations into the same space. It then aligns that space with the LLM through LoRA fine-tuning, injecting graph tokens and textual information into the LLM. This lets the LLM act as the ensembler over multiple GNNs while contributing its own semantic understanding. As a result, LensGNN retains the semantic richness of node attributes, makes better use of multiple GNN models, and is intended to generalize better across tasks and datasets.
Experimental results show that LensGNN outperforms existing baseline models on multiple datasets, demonstrating its advantages in combining semantic and structural information.
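
The two alignment steps described above can be pictured with a small, dependency-free Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the helper names (`make_projection`, `project`, `build_llm_input`), the `<graph:NAME>` placeholder format, the prompt wording, and all dimensions are hypothetical, and the random matrices stand in for alignment layers that would be learned in practice. LoRA fine-tuning of the LLM itself is omitted.

```python
import random


def make_projection(in_dim, out_dim, rng):
    """Random (untrained) linear map standing in for a learned alignment layer."""
    scale = 1.0 / in_dim ** 0.5
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def project(weight, vec):
    """Matrix-vector product: maps one embedding into the target space."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def build_llm_input(node_text, gnn_embeddings, projections, to_llm):
    """Return (graph_tokens, prompt) for one node.

    gnn_embeddings: {gnn_name: embedding}; each GNN may use its own width.
    projections:    {gnn_name: matrix} mapping that GNN into the shared space.
    to_llm:         matrix mapping the shared space to the LLM embedding width.
    """
    graph_tokens, placeholders = [], []
    for name in sorted(gnn_embeddings):
        # Step 1: map this GNN's representation into the shared space.
        shared = project(projections[name], gnn_embeddings[name])
        # Step 2: map the shared representation to the LLM embedding width.
        graph_tokens.append(project(to_llm, shared))
        # Hypothetical placeholder marking where the graph token is injected.
        placeholders.append(f"<graph:{name}>")
    prompt = " ".join(placeholders) + " Text: " + node_text + " Predict the node's class."
    return graph_tokens, prompt


# Toy usage: two GNNs with different embedding widths (values are made up).
rng = random.Random(0)
shared_dim, llm_dim = 4, 8
embeddings = {"GCN": [0.1, -0.3, 0.5], "GAT": [0.2, 0.0, -0.1, 0.4, 0.7]}
projections = {
    name: make_projection(len(vec), shared_dim, rng)
    for name, vec in embeddings.items()
}
to_llm = make_projection(shared_dim, llm_dim, rng)

tokens, prompt = build_llm_input(
    "A paper on graph attention.", embeddings, projections, to_llm
)
print(prompt)
print(len(tokens), len(tokens[0]))
```

In this sketch each GNN contributes one graph token of the same width regardless of its native embedding size, which is the property that lets a single LLM consume the outputs of several GNNs side by side with the node text.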