Telco-DPR: A Hybrid Dataset for Evaluating Retrieval Models of 3GPP Technical Specifications

Thaina Saraiva, Marco Sousa, Pedro Vieira, António Rodrigues
2024-10-16
Abstract: This paper proposes a Question-Answering (QA) system for the telecom domain using 3rd Generation Partnership Project (3GPP) technical documents. Alongside, a hybrid dataset, Telco-DPR, which consists of a curated 3GPP corpus in a hybrid format, combining text and tables, is presented. Additionally, the dataset includes a set of synthetic question/answer pairs designed to evaluate the retrieval performance of QA systems on this type of data. The retrieval models, including the sparse model, Best Matching 25 (BM25), as well as dense models, such as Dense Passage Retriever (DPR) and Dense Hierarchical Retrieval (DHR), are evaluated and compared using top-K accuracy and Mean Reciprocal Rank (MRR). The results show that DHR, a retriever model utilising hierarchical passage selection through fine-tuning at both the document and passage levels, outperforms traditional methods in retrieving relevant technical information, achieving a Top-10 accuracy of 86.2%. Additionally, the Retriever-Augmented Generation (RAG) technique, used in the proposed QA system, is evaluated to demonstrate the benefits of using the hybrid dataset and the DHR. The proposed QA system, using the developed RAG model and the Generative Pretrained Transformer (GPT)-4, achieves a 14% improvement in answer accuracy, when compared to a previous benchmark on the same dataset.
Information Retrieval, Computation and Language, Machine Learning, Networking and Internet Architecture
What problem does this paper attempt to address?
The main problem this paper addresses is how to build a question-answering (QA) system for the telecommunications domain that can accurately answer questions about 3GPP technical specifications. The main challenges are:

1. **Handling specialised technical documents**: 3GPP technical documents contain a large number of domain-specific terms, complex protocols and structural elements (such as tables), which make it difficult for general-purpose large language models (LLMs) to understand them and retrieve the relevant information accurately.
2. **Improving retrieval performance**: The existing Retrieval-Augmented Generation (RAG) framework has limitations when applied to highly structured telecommunications standards, especially when the data mixes text and tables. A new method is therefore needed to improve the retrieval and understanding of these documents.

To address these problems, the paper proposes the following:

- **Telco-DPR dataset**: A hybrid-format dataset containing a carefully curated 3GPP corpus and synthetic question/answer pairs for evaluating the performance of retrieval models. The dataset combines text and tables so that it reflects the actual structure of 3GPP documents more realistically.
- **Optimized QA system architecture**: A QA system based on the RAG framework, which uses a Dense Hierarchical Retrieval (DHR) model to select relevant passages and a generative reader to produce the final answer. The DHR model is fine-tuned at both the document level and the passage level, which significantly improves the accuracy of retrieving relevant technical information.
- **Experimental verification**: Several retrieval models (BM25, DPR and DHR, among others) are evaluated in a series of experiments, and DHR is shown to outperform the others in Top-10 accuracy (86.2%) and Mean Reciprocal Rank (MRR). In addition, the RAG system using GPT-4 as the generator is compared with a benchmark system, and the results show that it achieves a 14% improvement in accuracy on the multiple-choice question (MCQ) dataset.

In summary, this research aims to overcome the difficulties that existing LLMs encounter in specialised domains (such as mobile wireless networks) and offers new ideas and techniques for developing more capable and efficient QA systems.
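
The two retrieval metrics named above, top-K accuracy and MRR, can be illustrated with a minimal sketch. It assumes each question has exactly one gold passage and that the retriever returns a ranked list of passage IDs; the function names and the toy passage IDs are hypothetical and are not taken from the paper or its code.

```python
from typing import Hashable, Sequence


def top_k_accuracy(
    rankings: Sequence[Sequence[Hashable]],
    gold: Sequence[Hashable],
    k: int,
) -> float:
    """Fraction of questions whose gold passage is among the top-k results."""
    if not rankings:
        return 0.0
    hits = sum(1 for ranked, g in zip(rankings, gold) if g in ranked[:k])
    return hits / len(rankings)


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    gold: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the gold passage; a miss contributes 0."""
    if not rankings:
        return 0.0
    total = 0.0
    for ranked, g in zip(rankings, gold):
        if g in ranked:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (list(ranked).index(g) + 1)
    return total / len(rankings)


# Toy data (assumed for illustration): three questions, one gold passage each.
rankings = [
    ["p3", "p1", "p7"],  # gold "p1" is at rank 2
    ["p2", "p9", "p4"],  # gold "p2" is at rank 1
    ["p5", "p6", "p8"],  # gold "p0" was not retrieved
]
gold = ["p1", "p2", "p0"]

top1 = top_k_accuracy(rankings, gold, k=1)
top2 = top_k_accuracy(rankings, gold, k=2)
mrr = mean_reciprocal_rank(rankings, gold)
```

Top-K accuracy only records whether the gold passage appears within the first K results, whereas MRR also rewards placing it higher in the list, which is why the two are usually reported together.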