Practical Applications of Deep Learning To Impute Heterogeneous Drug Discovery Data
Benedict W. J. Irwin, Julian R. Levell, Thomas M. Whitehead, Matthew D. Segall, Gareth J. Conduit
DOI: https://doi.org/10.1021/acs.jcim.0c00443
IF: 6.162
2020-06-01
Journal of Chemical Information and Modeling
Abstract: Contemporary deep learning approaches still struggle to bring a useful improvement in the field of drug discovery because of the challenges of sparse, noisy, and heterogeneous data that are typically encountered in this context. We use a state-of-the-art deep learning method, Alchemite, to impute data from drug discovery projects, including multitarget biochemical activities, phenotypic activities in cell-based assays, and a variety of absorption, distribution, metabolism, and excretion (ADME) endpoints. The resulting model gives excellent predictions for activity and ADME endpoints, offering an average increase in R² of 0.22 versus quantitative structure–activity relationship methods. The model accuracy is robust to combining data across uncorrelated endpoints and projects with different chemical spaces, enabling a single model to be trained for all compounds and endpoints. We demonstrate improvements in accuracy on the latest chemistry and data when updating models with new data as an ongoing medicinal chemistry project progresses. The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.0c00443 and contains a description of the data set in terms of chemical diversity, chemical series, distributions and common chemical properties, and assay values (PDF).
chemistry, multidisciplinary, medicinal, computer science, interdisciplinary applications, information systems