PROMPTHEUS: A Human-Centered Pipeline to Streamline SLRs with LLMs

João Pedro Fernandes Torres, Catherine Mulligan, Joaquim Jorge, Catarina Moreira
2024-10-22
Abstract: The growing volume of academic publications poses significant challenges for researchers conducting timely and accurate Systematic Literature Reviews, particularly in fast-evolving fields like artificial intelligence. This growth of academic literature also makes it increasingly difficult for lay people to access scientific knowledge effectively, meaning academic literature is often misrepresented in the popular press and, more broadly, in society. Traditional SLR methods are labor-intensive and error-prone, and they struggle to keep up with the rapid pace of new research. To address these issues, we developed PROMPTHEUS: an AI-driven pipeline solution that automates the SLR process using Large Language Models. We aimed to enhance efficiency by reducing the manual workload while maintaining the precision and coherence required for comprehensive literature synthesis. PROMPTHEUS automates key stages of the SLR process, including systematic search, data extraction, topic modeling using BERTopic, and summarization with transformer models. Evaluations conducted across five research domains demonstrate that PROMPTHEUS reduces review time, achieves high precision, and provides coherent topic organization, offering a scalable and effective solution for conducting literature reviews in an increasingly crowded research landscape. In addition, such tools may reduce the increasing mistrust in science by making summarization more accessible to laypeople. The code for this project can be found in the GitHub repository at https://github.com/joaopftorres/PROMPTHEUS.git.
Artificial Intelligence
### What problem does this paper attempt to address?

This paper addresses the challenges of conducting Systematic Literature Reviews (SLRs). As the number of academic publications rises sharply, researchers struggle to conduct timely and accurate SLRs, especially in fast-moving fields such as artificial intelligence. The problems include:

1. **High manual workload**: Traditional SLR methods rely heavily on manual labor, are error-prone, and struggle to keep up with the rapid pace of new research.
2. **Difficulty in literature screening**: The sheer volume of academic literature makes it increasingly difficult to identify relevant work. It also makes scientific knowledge harder for lay people to access, so research is often misrepresented in the popular press and in society more broadly.
3. **Lack of integrated automation tools**: Existing tools can assist with literature search and summarization, but they usually operate independently and do not automate the entire SLR process, so human effort is still required at key stages.

To solve these problems, the authors developed PROMPTHEUS, an AI-driven pipeline based on Large Language Models (LLMs) that automates the SLR process. Its goal is to improve efficiency by reducing the manual workload while maintaining the precision and coherence required for comprehensive literature synthesis. Its main functions include:

- **Systematic search and screening**: Automatically retrieve and filter relevant literature using techniques such as GPT and Sentence-BERT.
- **Data extraction and topic modeling**: Use BERTopic for topic modeling so that the extracted information is clearly organized.
- **Synthesis and summarization**: Generate coherent literature summaries with models such as T5 and GPT.

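The screening step can be illustrated with a minimal sketch, under stated assumptions. The summary says only that GPT and Sentence-BERT are used to retrieve and filter literature; it does not give the scoring rule or threshold. The code below assumes a common pattern: embed the query and each abstract, then keep abstracts whose cosine similarity to the query clears a cutoff. To stay dependency-free, the hypothetical `embed` function uses a bag-of-words count vector as a stand-in for a Sentence-BERT embedding, and the names `screen` and `threshold` are illustrative, not taken from the PROMPTHEUS code.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: a bag-of-words count vector.

    Assumption: a real pipeline would call a Sentence-BERT model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def screen(query: str, abstracts: List[str],
           threshold: float = 0.2) -> List[Tuple[float, str]]:
    """Return (score, abstract) pairs at or above the threshold, best first.

    The threshold value is a hypothetical default, not one from the paper.
    """
    q = embed(query)
    scored = [(cosine_similarity(q, embed(a)), a) for a in abstracts]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    papers = [
        "Large language models automate systematic literature reviews.",
        "A field study of soil erosion in alpine meadows.",
    ]
    query = "automating literature reviews with language models"
    for score, abstract in screen(query, papers):
        print(f"{score:.2f}  {abstract}")
```

Swapping `embed` for a dense sentence encoder would leave `screen` unchanged in structure, although `cosine_similarity` would then operate on dense vectors rather than `Counter` objects.
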
Through these automated steps, PROMPTHEUS reduces review time, achieves high precision, and provides coherent topic organization, offering a scalable and effective solution for conducting literature reviews in an increasingly crowded research environment. In addition, such tools may reduce growing public mistrust in science by making summarization more accessible to lay people.

### Key contributions

- **Novel integration of SLR stages**: Proposes a fully automated SLR method that integrates multiple stages (search, extraction, synthesis) into an end-to-end process using advanced natural language processing (NLP) techniques.
- **High precision in literature retrieval**: Uses state-of-the-art language models to improve the accuracy of literature retrieval, so that researchers obtain high-quality, highly relevant studies.
- **Structured topic modeling**: Uses BERTopic for topic modeling so that extracted information is clearly and consistently organized.
- **Comprehensive evaluation**: Evaluates the pipeline with multiple metrics (such as ROUGE scores, Flesch readability scores, cosine similarity, and topic consistency) to show that PROMPTHEUS automates the SLR process effectively while maintaining high accuracy and improving the readability of the content.

By automating the most time-consuming parts of SLRs, PROMPTHEUS aims to make reviews more efficient and comprehensive, giving researchers more time for innovative, high-impact work while helping them keep up with key developments.
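Two of the evaluation metrics named in the contributions, ROUGE and Flesch readability, can be illustrated with a short sketch. This is not the evaluation code from the paper: it assumes ROUGE-1 F1 (unigram overlap with clipped counts, no stemming) and the standard Flesch Reading Ease formula, and the syllable counter is a rough vowel-group heuristic introduced only for illustration. Published ROUGE implementations typically add stemming and other normalization, so absolute scores will differ.

```python
import re
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    """Lowercase word tokens; no stemming (an assumption of this sketch)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand, ref = Counter(tokens(candidate)), Counter(tokens(reference))
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, with a minimum of one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease; higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


if __name__ == "__main__":
    summary = "The pipeline finds papers. It groups them by topic."
    reference = "The pipeline finds relevant papers and groups them by topic."
    print(f"ROUGE-1 F1: {rouge1_f1(summary, reference):.3f}")
    print(f"Flesch Reading Ease: {flesch_reading_ease(summary):.1f}")
```

A generated summary would be scored with `rouge1_f1` against a reference text such as the original abstracts, and with `flesch_reading_ease` on its own to gauge how accessible it is.
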