EasyRAG: Efficient Retrieval-Augmented Generation Framework for Automated Network Operations

Zhangchi Feng, Dongdong Kuang, Zhongyuan Wang, Zhijie Nie, Yaowei Zheng, Richong Zhang
2024-10-15
Abstract: This paper presents EasyRAG, a simple, lightweight, and efficient retrieval-augmented generation framework for automated network operations. Our framework has three advantages. The first is accurate question answering. We designed a straightforward RAG scheme based on (1) a specific data processing workflow, (2) dual-route sparse retrieval for coarse ranking, (3) an LLM Reranker for reranking, and (4) LLM answer generation and optimization. This approach achieved first place in the GLM4 track in the preliminary round and second place in the GLM4 track in the semifinals. The second is simple deployment. Our method primarily consists of BM25 retrieval and BGE-reranker reranking; it requires no fine-tuning of any models, occupies minimal VRAM, and is easy to deploy and highly scalable. We provide a flexible code library with various search and generation strategies, facilitating custom process implementation. The last one is efficient inference. We designed an efficient inference acceleration scheme for the entire coarse ranking, reranking, and generation process that significantly reduces the inference latency of RAG while maintaining a good level of accuracy; each acceleration scheme can be plugged into any component of the RAG process, consistently enhancing the efficiency of the RAG system. Our code and data are released at https://github.com/BUAADreamer/EasyRAG.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper aims to improve the efficiency, accuracy, and ease of deployment of Retrieval-Augmented Generation (RAG) for automated network operation and maintenance. It proposes a framework named EasyRAG, which addresses the problem in three respects:
1. **Accurate question answering**: A simple RAG scheme is built from a specific data processing flow, dual-route sparse retrieval for coarse ranking, an LLM reranker, and LLM answer generation and optimization. This method won first place in the GLM4 track in the preliminary round and second place in the GLM4 track in the semifinals.
2. **Simple deployment**: The method mainly consists of BM25 retrieval and BGE reranking. It does not require fine-tuning of any models, occupies very little VRAM, is easy to deploy, and is highly scalable. A flexible code library is provided that supports various search and generation strategies and makes it easy to implement custom pipelines.
3. **Efficient inference**: An inference acceleration scheme covers the entire coarse-ranking, reranking, and generation process, significantly reducing RAG inference latency while maintaining good accuracy. Each acceleration scheme can be plugged into any component of the RAG process, consistently improving the efficiency of the RAG system.
Overall, the goal of the paper is to provide a lightweight, efficient, and easy-to-deploy solution, the EasyRAG framework, for the challenges of automated network operation and maintenance.
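
The coarse-ranking and reranking stages described above can be sketched roughly as follows. This is an illustrative sketch, not the EasyRAG implementation: the two routes (BM25 over chunk text and BM25 over a chunk's path) are an assumption about what "dual-route" means, the `BM25` class is a hand-written Okapi BM25, and `overlap_rerank` is a token-overlap stand-in for the BGE reranker so that the example runs without downloading a model. All function names, field names, and sample data are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenization; a real system would use a proper tokenizer.
    return text.lower().split()


class BM25:
    """Minimal Okapi BM25 over a list of pre-tokenized documents."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.docs, self.k1, self.b = docs, k1, b
        self.tfs = [Counter(d) for d in docs]
        self.avgdl = sum(len(d) for d in docs) / len(docs)
        df = Counter(t for d in docs for t in set(d))
        n = len(docs)
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, i):
        tf, dl = self.tfs[i], len(self.docs[i])
        total = 0.0
        for t in query:
            if t in tf:
                norm = tf[t] + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
                total += self.idf[t] * tf[t] * (self.k1 + 1) / norm
        return total

    def top_k(self, query, k):
        # Highest score first; ties broken by index; zero-score docs dropped.
        scored = [(self.score(query, i), i) for i in range(len(self.docs))]
        ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
        return [i for s, i in ranked[:k] if s > 0]


def coarse_retrieve(query, chunks, k=2):
    """Union of two BM25 routes: chunk text and chunk path (assumed routes)."""
    q = tokenize(query)
    text_route = BM25([tokenize(c["text"]) for c in chunks])
    path_route = BM25([tokenize(c["path"].replace("/", " ")) for c in chunks])
    merged = []
    for i in text_route.top_k(q, k) + path_route.top_k(q, k):
        if i not in merged:
            merged.append(i)
    return merged


def overlap_rerank(query, chunks, candidates, top_n=2):
    """Stand-in for a model reranker: fraction of query tokens in the chunk."""
    q = set(tokenize(query))

    def score(i):
        return len(q & set(tokenize(chunks[i]["text"]))) / len(q)

    return sorted(candidates, key=lambda i: (-score(i), i))[:top_n]


# Hypothetical sample corpus for illustration only.
CHUNKS = [
    {"path": "router/bgp/config", "text": "configure the bgp neighbor and autonomous system number"},
    {"path": "switch/vlan/setup", "text": "create a vlan and assign ports"},
    {"path": "router/ospf/troubleshooting", "text": "check ospf adjacency state and hello timers"},
    {"path": "alarms/bgp", "text": "peer session down alarm handling steps"},
]

if __name__ == "__main__":
    query = "how to configure bgp neighbor"
    candidates = coarse_retrieve(query, CHUNKS)
    print(overlap_rerank(query, CHUNKS, candidates))
```

In this sketch the path route surfaces the `alarms/bgp` chunk even though its text shares no tokens with the query, which is the kind of recall gain a second sparse route is meant to provide; the reranker then decides the final order. In a real pipeline, the top-ranked chunks would be passed to the LLM as context for answer generation.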