Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA

Wenyu Huang, Guancheng Zhou, Hongru Wang, Pavlos Vougiouklis, Mirella Lapata, Jeff Z. Pan
2024-10-08
Abstract: Retrieval-Augmented Generation (RAG) is widely used to inject external non-parametric knowledge into large language models (LLMs). Recent works suggest that Knowledge Graphs (KGs) contain valuable external knowledge for LLMs. Retrieving information from KGs differs from extracting it from document sets. Most existing approaches seek to directly retrieve relevant subgraphs, thereby eliminating the need for extensive SPARQL annotations, traditionally required by semantic parsing methods. In this paper, we model the subgraph retrieval task as a conditional generation task handled by small language models. Specifically, we define a subgraph identifier as a sequence of relations, each represented as a special token stored in the language models. Our base generative subgraph retrieval model, consisting of only 220M parameters, achieves competitive retrieval performance compared to state-of-the-art models relying on 7B parameters, demonstrating that small language models are capable of performing the subgraph retrieval task. Furthermore, our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks. Our model and data will be made available online: https://github.com/hwy9855/GSR.
Computation and Language
What problem does this paper attempt to address?
The paper addresses a problem in Knowledge Graph Question Answering (KGQA): how to use small language models (SLMs) to perform subgraph retrieval effectively, so that they match or even surpass large language models (LLMs) in both efficiency and effectiveness. Specifically, the paper focuses on the following points:

1. **Reduce the number of parameters**: Use small language models with fewer parameters for subgraph retrieval, thereby reducing computational cost and resource consumption.
2. **Improve efficiency**: Simplify the representation of the subgraph retrieval task and reduce the number of tokens required to generate the relation chain, thereby speeding up training and inference.
3. **Enhance effectiveness**: Despite the smaller parameter count, match or exceed the effectiveness of existing large language models by optimizing data processing and model training.

### Main contributions

1. **Propose the GSR model**: Introduce the Generative Subgraph Retriever (GSR), which uses small language models to perform subgraph retrieval.
2. **Design a training framework**: Propose a training framework with an indexing step and a retrieval step, including automatic collection of index data and two methods for improving the quality of retrieval data.
3. **Experimental verification**: Demonstrate the effectiveness of GSR through comprehensive experiments. The best model improves F1 by +9.2% and +5.3% on the WebQSP and CWQ benchmarks respectively, while the subgraph retrieval step becomes 7.7 times more efficient.

### Method overview

1. **Subgraph definition**: Define the subgraph as a multi-hop reasoning path from the topic entity to the answer entity, identified by a relation chain.
2. **Subgraph retrieval modeling**: Model subgraph retrieval as the task of generating subgraph IDs, that is, predicting a sequence of relation IDs.
3. **Data processing**:
   - **Index data**: Construct a mapping task from natural language questions to relation IDs, used to train the model to understand the natural language expressions of different relations.
   - **Retrieval data**: Obtain high-quality training samples from weakly supervised data through two methods: filtering and GPT selection.
4. **Training strategy**: Adopt multi-task joint training, using index data and retrieval data to train the GSR model simultaneously.
5. **Inference strategy**: Use beam search at inference time to obtain the top-k subgraph IDs, and retain only the valid relation chains.

### Experimental results

- **Subgraph retrieval performance**: GSR performs strongly on subgraph retrieval and reaches its highest F1 score when trained on GPT-selected data.
- **End-to-end performance**: On the WebQSP and CWQ datasets, GSR combined with LLM readers outperforms most existing baseline methods, especially in Hits@1 and F1.

In conclusion, by casting subgraph retrieval as subgraph-ID generation and optimizing data processing, the paper demonstrates the potential of small language models for subgraph retrieval, improving both efficiency and effectiveness.
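The inference step in the method overview (generate several candidate relation chains, then retain the valid ones) can be illustrated with a minimal sketch. The summary does not spell out what "valid" means, so this sketch assumes a chain is valid when it can be followed hop by hop from the topic entity in the KG. The toy graph, the entity and relation names, and the helpers `follow_chain` and `retain_valid_chains` are all hypothetical, and the hard-coded `candidates` list merely stands in for beam-search output from a trained model.

```python
from typing import Dict, List, Set, Tuple

# Toy knowledge graph: (head entity, relation) -> set of tail entities.
# Every entity and relation name here is invented for illustration.
KG: Dict[Tuple[str, str], Set[str]] = {
    ("Q_film", "directed_by"): {"Q_director"},
    ("Q_director", "born_in"): {"Q_city"},
    ("Q_film", "starring"): {"Q_actor"},
}


def follow_chain(kg: Dict[Tuple[str, str], Set[str]],
                 topic: str,
                 chain: Tuple[str, ...]) -> Set[str]:
    """Follow a relation chain from the topic entity.

    Returns the entities reached after the last hop, or an empty set
    if any hop has no matching edge in the graph.
    """
    frontier = {topic}
    for relation in chain:
        next_frontier: Set[str] = set()
        for entity in frontier:
            next_frontier |= kg.get((entity, relation), set())
        if not next_frontier:
            return set()  # the chain breaks at this hop
        frontier = next_frontier
    return frontier


def retain_valid_chains(kg: Dict[Tuple[str, str], Set[str]],
                        topic: str,
                        candidates: List[Tuple[str, ...]]
                        ) -> List[Tuple[Tuple[str, ...], Set[str]]]:
    """Keep only candidate chains that reach at least one entity (assumed validity rule)."""
    kept = []
    for chain in candidates:
        reached = follow_chain(kg, topic, chain)
        if reached:
            kept.append((chain, reached))
    return kept


# Stand-in for the top-k subgraph IDs a generative retriever might emit.
candidates = [
    ("directed_by", "born_in"),  # can be followed end to end
    ("born_in",),                # no such edge from the topic entity
    ("starring", "born_in"),     # breaks at the second hop
]

valid = retain_valid_chains(KG, "Q_film", candidates)
for chain, reached in valid:
    print(" -> ".join(chain), "reaches", sorted(reached))
```

In this sketch only the first candidate survives, since the other two cannot be traversed in the toy graph. A real system would apply the same kind of check against the full KG before passing the retrieved subgraph to the LLM reader.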