G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering

Xiaoxin He, Yijun Tian, Yifei Sun, Nitesh V. Chawla, Thomas Laurent, Yann LeCun, Xavier Bresson, Bryan Hooi
2024-03-14
Abstract: Given a graph with textual attributes, we enable users to "chat with their graph": that is, to ask questions about the graph using a conversational interface. In response to a user's questions, our method provides textual replies and highlights the relevant parts of the graph. While existing works integrate large language models (LLMs) and graph neural networks (GNNs) in various ways, they mostly focus on either conventional graph tasks (such as node, edge, and graph classification), or on answering simple graph queries on small or synthetic graphs. In contrast, we develop a flexible question-answering framework targeting real-world textual graphs, applicable to multiple applications including scene graph understanding, common sense reasoning, and knowledge graph reasoning. Toward this goal, we first develop our Graph Question Answering (GraphQA) benchmark with data collected from different tasks. Then, we propose our G-Retriever approach, which integrates the strengths of GNNs, LLMs, and Retrieval-Augmented Generation (RAG), and can be fine-tuned to enhance graph understanding via soft prompting. To resist hallucination and to allow for textual graphs that greatly exceed the LLM's context window size, G-Retriever performs RAG over a graph by formulating this task as a Prize-Collecting Steiner Tree optimization problem. Empirical evaluations show that our method outperforms baselines on textual graph tasks from multiple domains, scales well with larger graph sizes, and resists hallucination. (Our codes and datasets are available at:
Machine Learning
What problem does this paper attempt to address?
The problem this paper attempts to solve is: **how to support interactive question answering between users and textual graphs, especially for complex, real-world textual graph applications**. Specifically, the paper aims to develop a flexible question-answering framework that lets users "chat" with a graph through a conversational interface: they ask questions about the graph and receive accurate, relevant textual answers, with the relevant parts of the graph highlighted.

### Problem Background

Although existing work has combined large language models (LLMs) and graph neural networks (GNNs), most of it focuses on conventional graph tasks (such as node classification, edge classification, and graph classification) or is limited to answering simple queries on small or synthetic graphs. These methods do not handle large, complex, real-world textual graphs well.

### Core Contributions of the Paper

1. **New GraphQA Benchmark**: To evaluate graph question-answering ability across domains, the authors introduce a diverse GraphQA benchmark built from data collected from different tasks, covering common-sense reasoning, scene-graph understanding, and knowledge-graph reasoning.
2. **G-Retriever Architecture**: The authors propose G-Retriever, a framework that combines GNNs, LLMs, and retrieval-augmented generation (RAG). G-Retriever can be fine-tuned via soft prompting to improve graph understanding, and it reduces hallucination by retrieving information directly from the graph.
3. **Advanced Graph Retrieval Technique**: To address the shortcomings of existing RAG methods on graph data, the authors propose a sub-graph retrieval method that formulates retrieval as a Prize-Collecting Steiner Tree (PCST) optimization problem. This method selects the sub-graph most relevant to the query while keeping its size manageable.
4. **Empirical Research**: The authors evaluate G-Retriever on textual graph tasks from multiple domains, showing that it outperforms baseline methods, scales well with larger graph sizes, and resists hallucination.

### Key Technologies of the Solution

- **Retrieval-Augmented Generation (RAG)**: Retrieving relevant information directly from the graph grounds the answer in graph content, which helps resist hallucination, and allows for textual graphs that greatly exceed the LLM's context window.
- **Sub-graph Construction**: The PCST formulation is used to construct a connected sub-graph that contains as many relevant nodes and edges as possible while staying small, so that the information passed to the LLM is both comprehensive and concise.
- **Graph Encoding and Generation**: A GNN encodes the retrieved sub-graph, and the LLM generates the final answer, with the encoded graph serving as a soft prompt.

### Summary

The main goal of this paper is to develop a question-answering system that can handle complex real-world textual graphs, allowing users to interact with a graph in natural language and obtain accurate, meaningful answers. By introducing the GraphQA benchmark and the G-Retriever architecture, the authors address limitations of existing methods and provide new directions and tools for future research.
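
To make the sub-graph construction step under "Key Technologies" more concrete, the following is a minimal Python sketch of prize-collecting retrieval. It is not the authors' implementation and not an exact PCST solver: `greedy_pcst` is a simple greedy heuristic written for illustration, `topk_prizes` is an assumed scheme that gives rank-based prizes to the `k` nodes most similar to the query, and the uniform `edge_cost`, the toy graph, and the similarity scores are all hypothetical.

```python
from collections import deque


def topk_prizes(scores, k):
    """Assumed prize scheme: the k best-scoring nodes get prizes k, k-1, ..., 1."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    return {node: float(k - rank) for rank, node in enumerate(ranked)}


def greedy_pcst(adj, prizes, edge_cost):
    """Greedy heuristic for a prize-collecting Steiner tree (illustrative only).

    adj:       dict mapping node -> list of neighbours (undirected graph)
    prizes:    dict mapping node -> non-negative prize (missing nodes count as 0)
    edge_cost: uniform cost paid for every edge kept in the tree
    Returns (nodes, edges) of one connected sub-graph.
    """
    if not prizes:
        return set(), set()
    root = max(prizes, key=prizes.get)  # start from the most relevant node
    nodes, edges = {root}, set()
    while True:
        # Multi-source BFS from the current tree gives hop-shortest paths.
        parent = {n: None for n in nodes}
        queue = deque(sorted(nodes))
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent:
                    parent[v] = u
                    queue.append(v)
        best_gain, best_path = 0.0, None
        for target in parent:
            if target in nodes or prizes.get(target, 0.0) <= 0:
                continue
            # Walk back from the prized node until the current tree is reached.
            path, n = [], target
            while n not in nodes:
                path.append(n)
                n = parent[n]
            path.append(n)
            # Net gain: prizes of the new nodes minus the cost of the new edges.
            gain = sum(prizes.get(p, 0.0) for p in path[:-1])
            gain -= edge_cost * (len(path) - 1)
            if gain > best_gain:
                best_gain, best_path = gain, path
        if best_path is None:  # no attachment pays for itself any more
            return nodes, edges
        for child, par in zip(best_path, best_path[1:]):
            nodes.add(child)
            edges.add(frozenset((child, par)))


# Hypothetical toy graph and query-similarity scores.
adj = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1, 5], 5: [4]}
scores = {0: 0.9, 1: 0.1, 2: 0.2, 3: 0.8, 4: 0.05, 5: 0.3}
nodes, edges = greedy_pcst(adj, topk_prizes(scores, k=2), edge_cost=0.25)
print(sorted(nodes))  # [0, 1, 2, 3]
```

In this toy run, the two prized nodes (0 and 3) are joined through the unprized nodes 1 and 2 because the collected prize outweighs the cost of the three connecting edges. Raising `edge_cost` to 0.5 makes that connection unprofitable, and only node 0 is returned, which illustrates how the cost term keeps the retrieved sub-graph small.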