A Survey of Graph Meets Large Language Model: Progress and Future Directions

Yuhan Li, Zhixun Li, Peisong Wang, Jia Li, Xiangguo Sun, Hong Cheng, Jeffrey Xu Yu
2024-04-24
Abstract: Graphs play a significant role in representing and analyzing complex relationships in real-world applications such as citation networks, social networks, and biological data. Recently, Large Language Models (LLMs), which have achieved tremendous success in various domains, have also been leveraged in graph-related tasks to surpass traditional Graph Neural Network (GNN) based methods and yield state-of-the-art performance. In this survey, we present a comprehensive review and analysis of existing methods that integrate LLMs with graphs. First, we propose a new taxonomy, which organizes existing methods into three categories based on the role (i.e., enhancer, predictor, and alignment component) played by LLMs in graph-related tasks. Then we systematically survey the representative methods along the three categories of the taxonomy. Finally, we discuss the remaining limitations of existing studies and highlight promising avenues for future research. The relevant papers are summarized and will be consistently updated at: https://github.com/yhLeeee/Awesome-LLMs-in-Graph-tasks
Machine Learning, Computation and Language, Social and Information Networks
What problem does this paper attempt to address?
This paper addresses how to combine large language models (LLMs) with graph neural networks (GNNs) for graph data processing and analysis, in order to overcome the limitations of traditional GNNs on complex graph-structured data and to improve performance on graph-related tasks. Specifically, the paper focuses on the following aspects:

1. **Enhancing Node Representations**: Using the text-understanding capabilities of LLMs to produce higher-quality node embeddings, thereby improving the classification performance of GNNs. For example, LLMs can generate explanations or pseudo-labels that enrich the text attributes of nodes, or their text embeddings can be used directly as initial node embeddings.
2. **Prediction Tasks**: Exploring how to use LLMs for graph-related prediction tasks such as classification and reasoning. This includes converting the graph structure into a text description so that an LLM can process it directly, or combining the LLM with structural features extracted by a GNN for prediction.
3. **Alignment Techniques**: Studying how to bring LLM and GNN representations into the same vector space through alignment techniques, so as to better capture both the structural and the semantic information of graph data. Examples include introducing relation prior tokens or using a language encoder to model the knowledge shared among different relations.
4. **Systematic Review and Future Directions**: Providing a comprehensive review and analysis of existing methods that combine LLMs with graph data, and proposing future research directions. The paper proposes a new taxonomy that classifies existing methods into three categories: Enhancer, Predictor, and Alignment Component.

Through these aspects, the paper aims to provide new perspectives and methods for graph data processing and analysis and to promote further development of the field.
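
To make the predictor role more concrete, the following is a minimal sketch of converting graph structure into a text description that an LLM could read. It is not taken from the paper: the function name `graph_to_prompt`, the prompt wording, the toy citation graph, and the label set are all hypothetical, and methods covered by the survey may serialize graphs quite differently.

```python
from typing import Dict, List, Tuple


def graph_to_prompt(
    node_text: Dict[int, str],
    edges: List[Tuple[int, int]],
    target: int,
    labels: List[str],
) -> str:
    """Serialize a target node and its 1-hop neighbors into a text prompt.

    Hypothetical helper for illustration; the prompt format is an assumption.
    """
    # Build an undirected adjacency list from the edge list.
    neighbors: Dict[int, List[int]] = {node: [] for node in node_text}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    lines = [f"Target paper: {node_text[target]}", "Linked papers:"]
    linked = sorted(set(neighbors[target]))
    if linked:
        # One bullet per distinct neighbor, in a deterministic order.
        lines.extend(f"- {node_text[n]}" for n in linked)
    else:
        lines.append("- (none)")
    lines.append(
        "Question: which category does the target paper belong to? "
        f"Options: {', '.join(labels)}."
    )
    return "\n".join(lines)


# Toy citation graph (made-up example data).
node_text = {
    0: "Graph attention networks",
    1: "Semi-supervised classification with GCNs",
    2: "Attention is all you need",
}
edges = [(0, 1), (0, 2)]
prompt = graph_to_prompt(node_text, edges, target=0, labels=["graph learning", "NLP"])
print(prompt)
```

The resulting string would then be sent to an LLM of choice. Only the 1-hop neighborhood is serialized here to keep the prompt short; how much structure to include is a design choice that this sketch does not try to settle.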