DeLA-Drug: A Deep Learning Algorithm for Automated Design of Druglike Analogues

Teresa Maria Creanza, Giuseppe Lamanna, Pietro Delre, Marialessandra Contino, Nicola Corriero, Michele Saviano, Giuseppe Felice Mangiatordi, Nicola Ancona
DOI: https://doi.org/10.1021/acs.jcim.2c00205
IF: 6.162
2022-03-16
Journal of Chemical Information and Modeling
Abstract: In this paper, we present a deep learning algorithm for automated design of druglike analogues (DeLA-Drug), a recurrent neural network (RNN) model composed of two long short-term memory (LSTM) layers and conceived for data-driven generation of similar-to-bioactive compounds. DeLA-Drug captures the syntax of SMILES strings of more than 1 million compounds belonging to the ChEMBL28 database and, by employing a new strategy called sampling with substitutions (SWS), generates molecules starting from a single user-defined query compound. Remarkably, the algorithm preserves druglikeness and synthetic accessibility of the known bioactive compounds present in the ChEMBL28 repository. The absence of any time-demanding fine-tuning procedure enables DeLA-Drug to perform a fast generation of focused libraries for further high-throughput screening and makes it a suitable tool for performing de novo design even in low-data regimes. To provide a concrete idea of its applicability, DeLA-Drug was applied to the cannabinoid receptor subtype 2 (CB2R), a known target involved in different pathological conditions such as cancer and neurodegeneration. DeLA-Drug, available as a free web platform (http://www.ba.ic.cnr.it/softwareic/deladrugportal/), can help medicinal chemists interested in generating analogues of compounds already available in their laboratories and, for this reason, good candidates for an easy and low-cost synthesis.
Supporting Information: available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.2c00205. Detailed description of the generative model architecture; number of valid and diverse compounds and QED returned by SWS (C = 8) and SWRS after generating 1,000,000 molecules (Table S1); the results of additional tests performed using as queries compounds with high/low lipophilicity and high/low molecular weight (PDF). Molecular formula strings (ZIP).
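The abstract names sampling with substitutions (SWS) but does not describe its mechanics. As a rough illustration only, the sketch below shows one plausible reading: a few token positions of the query SMILES are chosen at random, and each is replaced by a token drawn from a model's next-token distribution conditioned on the preceding tokens. Everything here is an assumption made for illustration and not the authors' implementation: the function name `sample_with_substitutions`, the `toy_model` stand-in for the trained LSTM, the character-level tokenization, and the interpretation of C as the number of substitutions.

```python
import random
from typing import Callable, Dict, List, Sequence

# Hypothetical interface: maps a token prefix to a probability
# distribution over the next token. In DeLA-Drug this role would be
# played by the trained two-layer LSTM; here it is only a placeholder.
NextTokenModel = Callable[[Sequence[str]], Dict[str, float]]


def toy_model(prefix: Sequence[str]) -> Dict[str, float]:
    """Uniform distribution over a tiny vocabulary (illustration only)."""
    vocab = ["C", "N", "O", "c", "n", "(", ")", "1", "="]
    p = 1.0 / len(vocab)
    return {token: p for token in vocab}


def sample_with_substitutions(
    query_tokens: Sequence[str],
    model: NextTokenModel,
    n_substitutions: int,
    rng: random.Random,
) -> List[str]:
    """Return a copy of the query with up to n_substitutions tokens resampled.

    Assumed reading of SWS: pick positions at random, then replace each
    chosen token with one sampled from the model, conditioned on the
    (possibly already modified) tokens to its left.
    """
    tokens = list(query_tokens)
    n = min(n_substitutions, len(tokens))
    # Process positions left to right so each prefix is up to date.
    positions = sorted(rng.sample(range(len(tokens)), n))
    for pos in positions:
        distribution = model(tokens[:pos])
        candidates, weights = zip(*distribution.items())
        tokens[pos] = rng.choices(candidates, weights=weights, k=1)[0]
    return tokens


if __name__ == "__main__":
    rng = random.Random(0)
    # Character-level split is a simplification; it would mis-tokenize
    # multi-character SMILES atoms such as Cl or Br.
    query = list("CCOc1ccccc1")
    analogue = sample_with_substitutions(query, toy_model, 3, rng)
    print("".join(analogue))
```

With the uniform toy model, most outputs will not be valid SMILES. A real pipeline would presumably use the trained network's probabilities and discard strings that fail to parse with a cheminformatics toolkit.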
chemistry, multidisciplinary, medicinal, computer science, interdisciplinary applications, information systems
What problem does this paper attempt to address?