StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation

Davit Abrahamyan, Fatemeh H. Fard
2024-06-20
Abstract: Developers spend much time finding information that is relevant to their questions. Stack Overflow has been the leading resource, and with the advent of Large Language Models (LLMs), generative models such as ChatGPT are used frequently. However, there is a catch in using each one separately. Searching for answers is time-consuming and tedious, as shown by the many tools developed by researchers to address this issue. On the other hand, using LLMs is not reliable, as they might produce irrelevant or unreliable answers (i.e., hallucination). In this work, we present StackRAG, a retrieval-augmented multi-agent generation tool based on LLMs that combines the two worlds: aggregating the knowledge from SO to enhance the reliability of the generated answers. Initial evaluations show that the generated answers are correct, accurate, relevant, and useful.
Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
The paper attempts to address the challenges developers face when searching for information related to their problems. Specifically, while Stack Overflow (SO) is a major information resource, the process of searching for answers is time-consuming and cumbersome. On the other hand, large language models (LLMs) like ChatGPT can generate answers, but these answers may be inaccurate or unreliable (i.e., hallucinations). Therefore, the paper proposes a tool that combines these two approaches: StackRAG, which aims to improve the reliability and accuracy of generated answers by leveraging the knowledge base of SO through Retrieval-Augmented Generation (RAG) technology.

### Specific problems include:

1. **Low search efficiency**: Manually searching for relevant information on Stack Overflow is very time-consuming and cumbersome.
2. **Reliability issues of LLMs**: The answers generated by LLMs may be inaccurate or unreliable, especially in the field of software development, where incorrect answers can lead to serious consequences.
3. **Outdated information**: The static training data of LLMs cannot keep up with the latest technological innovations, making them unable to provide the latest solutions.
4. **Multi-task processing**: Complex queries may need to be broken down into multiple sub-problems, and existing tools are not flexible enough in this regard.

### Solution:

The paper proposes a multi-agent RAG tool called StackRAG, which combines the language generation capabilities of LLMs with the public knowledge base of Stack Overflow. The main goals of StackRAG are:

- **Improve answer reliability**: By retrieving relevant Q&A pairs from Stack Overflow, ensure that the generated answers are based on actual development experience.
- **Provide the latest information**: Utilize the real-time updates of Stack Overflow to ensure that the answers contain the latest technical information.
- **Increase search efficiency**: Reduce the time developers spend on manual searches through automated and optimized retrieval processes.
- **Handle complex queries**: Capable of breaking down complex queries into multiple sub-problems and handling each sub-problem separately.

### Preliminary evaluation results:

Preliminary evaluations show that the answers generated by StackRAG are more correct, accurate, relevant, and useful than those generated by basic LLMs (such as GPT-4). This indicates that StackRAG has significant advantages in addressing the aforementioned issues.
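
To make the retrieve-then-generate idea from the Solution section concrete, here is a minimal Python sketch. It is not StackRAG's implementation. It assumes a tiny in-memory list of Q&A pairs in place of Stack Overflow, a naive split on "and" and "?" in place of LLM-driven query decomposition, and word overlap in place of a real retriever. The names `QAPair`, `decompose`, `retrieve`, `build_prompt`, and `answer` are hypothetical, and `generate` is a placeholder for whatever LLM call a real system would make.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class QAPair:
    """Hypothetical stand-in for a Stack Overflow question/answer pair."""
    question: str
    answer: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def decompose(query: str) -> list[str]:
    """Split a compound query into sub-questions.

    Assumption: a naive split on '?' and the word 'and'. A real system
    would more likely ask an LLM to do this.
    """
    parts = re.split(r"\?|\band\b", query)
    return [p.strip() for p in parts if p.strip()]


def retrieve(sub_question: str, corpus: list[QAPair], k: int = 1) -> list[QAPair]:
    """Return up to k pairs whose question shares the most words with the query."""
    query_tokens = tokenize(sub_question)
    scored = [(len(query_tokens & tokenize(p.question)), p) for p in corpus]
    scored = [(score, p) for score, p in scored if score > 0]  # drop non-matches
    scored.sort(key=lambda item: item[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_prompt(sub_question: str, evidence: list[QAPair]) -> str:
    """Put the retrieved pairs in front of the question as grounding context."""
    context = "\n".join(f"- Q: {p.question}\n  A: {p.answer}" for p in evidence)
    return (
        f"Context:\n{context}\n\n"
        f"Question: {sub_question}\n"
        "Answer using only the context above."
    )


def answer(query: str, corpus: list[QAPair], generate: Callable[[str], str]) -> str:
    """Decompose the query, retrieve evidence per sub-question, then generate."""
    sections = []
    for sub_question in decompose(query):
        evidence = retrieve(sub_question, corpus)
        sections.append(generate(build_prompt(sub_question, evidence)))
    return "\n".join(sections)


if __name__ == "__main__":
    corpus = [
        QAPair("Reverse a list in Python", "Use reversed(xs) or xs[::-1]."),
        QAPair("Read a file in Python", "Use open(path) inside a with block."),
    ]
    # Stub generator: a real system would call an LLM here.
    stub = lambda prompt: f"[stub reply to a prompt of {len(prompt)} characters]"
    print(answer("How do I reverse a list in Python and how do I read a file in Python?", corpus, stub))
```

Passing `generate` in as a parameter keeps the retrieval logic testable without a model, and it makes the grounding step visible: the model only sees a sub-question together with the Q&A pairs retrieved for it.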