Deep Learning Strategies for Enhanced Molecular Docking and Virtual Screening

Isabella Alvim Guedes, Matheus Müller Pereira da Silva, Fábio Lima Custódio, Eduardo Krempser, Laurent Emmanuel Dardenne
DOI: https://doi.org/10.26434/chemrxiv-2023-zfv87-v2
2023-11-08
Abstract: Over the last few years, machine learning (ML) and deep learning (DL) have been revolutionising the computer-aided drug discovery landscape. With the recent availability of the so-called ultra-large virtual libraries (libraries with up to billions of readily available virtual compounds), new ML and DL approaches have been developed to enable the exploration of these large chemical spaces, achieving promising results. Molecular docking is one of the most widely used computational methods for performing in silico screenings of virtual libraries. The two primary goals of molecular docking are to predict the correct binding pose of small molecules inside the binding pocket of a protein target and to estimate the binding affinity of the protein-ligand complex. In particular, DL methods have been applied in all aspects of protein-ligand molecular docking, from pose and binding affinity prediction to virtual screening campaigns, improving computational costs and accuracy. This chapter introduces the core aspects of the molecular docking methodology and some fundamental concepts of machine learning and deep learning. We also describe different types of molecular representations and DL architectures commonly employed in the field, such as convolutional and graph neural networks. Furthermore, we provide insights into potential applications by presenting related works from the scientific literature. Finally, we discuss the current limitations, challenges, and biases of DL applied to molecular docking.
Chemistry
What problem does this paper attempt to address?
The paper reviews how deep learning is applied to molecular docking and virtual screening. Molecular docking is a widely used computational method in the early stages of drug discovery; it aims to predict the correct binding pose of a small molecule inside the binding pocket of a protein target and to estimate the binding affinity of the resulting complex. With the emergence of ultra-large virtual libraries (containing up to billions of readily available virtual compounds), new machine learning and deep learning methods have been developed to explore these large chemical spaces, with promising results. The paper introduces the core aspects of molecular docking, including conformational search, scoring functions, benchmark datasets, and ultra-large virtual libraries. It also outlines fundamental concepts of machine learning and deep learning: learning from data, statistical measures of predictive performance, bias and variance, regularization, and deep neural network architectures such as convolutional and graph neural networks. It then discusses applications of deep learning to docking, covering binding pose prediction, binding affinity prediction, and virtual screening. Although deep learning has improved computational cost and accuracy, challenges remain, such as the accuracy of protein-ligand binding affinity prediction, the problem of local optima, and applicability to large-scale virtual screening. Finally, the paper examines the limitations, challenges, and biases of current deep learning methods applied to molecular docking and closes with comments on future research directions.
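As an illustration of the "statistical measures of predictive performance" mentioned above, the following is a minimal sketch of three measures often reported for affinity prediction and virtual screening: the Pearson correlation, the root-mean-square error, and the enrichment factor. The choice of these three measures, the function names, and the toy numbers are assumptions made for illustration; they are not taken from the paper.

```python
import math


def pearson_r(predicted, experimental):
    """Pearson correlation between predicted and experimental affinities."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    return cov / math.sqrt(var_p * var_e)


def rmse(predicted, experimental):
    """Root-mean-square error between predicted and experimental affinities."""
    n = len(predicted)
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)


def enrichment_factor(scores, is_active, fraction):
    """Enrichment factor at the top `fraction` of a ranked screening list.

    Assumes higher scores mean "more likely active"; `is_active` holds 1 for
    known actives and 0 for inactives or decoys.
    """
    n = len(scores)
    n_top = max(1, int(fraction * n))
    # Rank compounds from best to worst score and keep the top slice.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    actives_in_top = sum(label for _, label in ranked[:n_top])
    total_actives = sum(is_active)
    # Hit rate in the top slice divided by the hit rate of the whole library.
    return (actives_in_top / n_top) / (total_actives / n)


# Toy data (hypothetical values, for illustration only).
predicted_affinity = [6.1, 7.4, 5.2, 8.0]
experimental_affinity = [6.0, 7.0, 5.5, 8.3]
print(pearson_r(predicted_affinity, experimental_affinity))
print(rmse(predicted_affinity, experimental_affinity))

screen_scores = [0.9, 0.8, 0.2, 0.1]
screen_labels = [1, 1, 0, 0]
print(enrichment_factor(screen_scores, screen_labels, 0.5))
```

Correlation and error describe how well predicted affinities track measured ones, whereas the enrichment factor describes how well a ranking concentrates known actives near the top of a screened library, which is usually the quantity of interest in a virtual screening campaign.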