Large Language Models in Drug Discovery: A Survey

Raghad AbuNasser
DOI: https://doi.org/10.26434/chemrxiv-2024-fmf9k
2024-10-03
Abstract: Drug discovery (DD) is a lengthy and resource-intensive process. However, a variety of advanced Artificial Intelligence (AI) and Deep Learning (DL) techniques, such as Large Language Models (LLMs), are being used to accelerate and advance DD. This survey aims to identify and compare the currently available LLMs, their methodologies, the datasets they use, and the DD tasks they support, in particular: de novo drug design, drug-target interaction prediction, masked language models, variational autoencoders, binding affinity prediction, drug repurposing, molecular optimization, activity prediction, contrastive learning for drug-target interaction prediction, and other miscellaneous models. The survey also offers insights into future directions and potential in this area.
Chemistry
What problem does this paper attempt to address?
The paper surveys the application of large language models (LLMs) in drug discovery (DD). Specifically, it addresses the following issues:

1. **Accelerating the drug discovery process**: Drug discovery is time-consuming and resource-intensive. The paper points out that advanced AI and deep learning techniques, especially large language models, are being used to speed up this process.
2. **Comparing existing LLMs**: The paper compares the available LLMs by their methodologies, the datasets they use, and the tasks they support in the drug discovery process (such as de novo drug design and drug-target interaction prediction).
3. **Future directions and potential**: The paper offers insights into future research directions and the potential of LLMs in drug discovery.

The paper covers applications of LLMs at various stages of drug discovery, including but not limited to:

- De novo drug design
- Drug-target interaction prediction
- Binding affinity prediction
- Drug repurposing
- Molecular optimization

Through these applications, the paper illustrates the potential of LLMs to accelerate drug discovery and highlights current challenges, such as insufficient training data.