Automating Biomedical Literature Review for Rapid Drug Discovery: Leveraging GPT-4 to Expedite Pandemic Response

Jingmei Yang, Kenji C. Walker, Ayse A. Bekar-Cesaretli, Boran Hao, Nahid Bhadelia, Diane Joseph-McCarthy, Ioannis Ch. Paschalidis
DOI: https://doi.org/10.1016/j.ijmedinf.2024.105500
IF: 4.73
2024-05-26
International Journal of Medical Informatics
Abstract:
Objective: The rapid expansion of the biomedical literature challenges traditional review methods, especially during outbreaks of emerging infectious diseases, when quick action is critical. Our study explores the potential of ChatGPT to automate biomedical literature review for rapid drug discovery.
Materials and Methods: We introduce a novel automated pipeline that helps identify drugs for a given virus in response to a potential future global health threat. The approach can be used to select PubMed articles that identify a drug target for the given virus. We tested it on two known pathogens: SARS-CoV-2, where the literature is vast, and Nipah, where the literature is sparse. Specifically, a panel of three experts reviewed a set of PubMed articles and labeled each as either describing a drug target for the given virus or not. The same task was given to the automated pipeline, and its performance was measured by how closely its labels matched those of the human experts. We applied a number of prompt engineering techniques to improve the performance of ChatGPT.
Results: Our best configuration used GPT-4 by OpenAI and achieved an out-of-sample validation performance with accuracy/F1-score/sensitivity/specificity of 92.87%/88.43%/83.38%/97.82% for SARS-CoV-2 and 87.40%/73.90%/74.72%/91.36% for Nipah.
Conclusion: These results highlight the utility of ChatGPT in drug discovery and development and reveal its potential to enable rapid drug target identification during a pandemic-level health emergency.
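The four metrics reported in the abstract can all be derived from a comparison of the pipeline's binary labels with the expert labels. The following is a minimal sketch of that calculation, not the authors' evaluation code: the function name `binary_metrics` and the example label lists are hypothetical, and it assumes the expert panel's labels have already been reduced to a single reference label per article (1 = describes a drug target, 0 = does not).

```python
from typing import Dict, Sequence


def binary_metrics(expert: Sequence[int], model: Sequence[int]) -> Dict[str, float]:
    """Compare model labels with expert reference labels (1 = positive, 0 = negative)."""
    if len(expert) != len(model):
        raise ValueError("expert and model label lists must have the same length")

    # Confusion-matrix counts, treating the expert label as ground truth.
    tp = sum(1 for e, m in zip(expert, model) if e == 1 and m == 1)
    tn = sum(1 for e, m in zip(expert, model) if e == 0 and m == 0)
    fp = sum(1 for e, m in zip(expert, model) if e == 0 and m == 1)
    fn = sum(1 for e, m in zip(expert, model) if e == 1 and m == 0)

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall on positive articles
    specificity = ratio(tn, tn + fp)   # recall on negative articles
    precision = ratio(tp, tp + fp)
    accuracy = ratio(tp + tn, len(expert))
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)

    return {
        "accuracy": accuracy,
        "f1": f1,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Hypothetical labels for eight articles; these are NOT data from the paper.
expert_labels = [1, 1, 1, 0, 0, 0, 0, 0]
model_labels = [1, 0, 1, 0, 0, 1, 0, 0]

metrics = binary_metrics(expert_labels, model_labels)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Reporting sensitivity and specificity alongside accuracy matters when relevant articles are rare: a pipeline could reach high accuracy by labeling almost everything negative, and only the sensitivity and F1-score would expose that.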
health care sciences & services; computer science, information systems; medical informatics
What problem does this paper attempt to address?