Optimizing Pharmacokinetic Property Prediction Based on Integrated Datasets and a Deep Learning Approach
Xiting Wang, Meng Liu, Lan Zhang, Yun Wang, Yu Li, Tao Lu
DOI: https://doi.org/10.1021/acs.jcim.0c00568
IF: 6.162
2020-08-17
Journal of Chemical Information and Modeling
Abstract: Oral bioavailability (OBA)-related pharmacokinetic properties, such as aqueous solubility, lipophilicity, and intestinal membrane permeability, play a significant role in drug discovery. However, their measurement is usually costly and time-consuming. Therefore, prediction models based on diverse approaches have been established in recent decades. Computational prediction of molecular properties has become an important step in drug discovery, aiming to identify potential drug-like candidates and reduce costs. However, limitations related to dataset capacity and algorithm adaptation still place restrictions on the applicability of the related models. In this study, we considered both dataset and algorithm optimization to address the challenge of predicting OBA-related molecular properties. Benchmark datasets of aqueous solubility (log <i>S</i>), lipophilicity (log <i>D</i>), and membrane permeability measured using the Caco-2 cell line (log <i>P</i><sub>app</sub>) were constructed by merging and calibrating experimental data from diverse articles and databases. Then, a novel molecular property prediction model, called a multiembedding-based synthetic network (MESN), was generated by applying a deep learning algorithm based on the synthesis of multiple types of molecular embeddings. MESN achieves performance improvements over other state-of-the-art methods for the prediction of aqueous solubility, lipophilicity, and membrane permeability. Results were also obtained using several other algorithms and independent validation datasets as a control study. Moreover, a dimension reduction analysis (based on t-distributed stochastic neighbor embedding, t-SNE) and an atomic feature similarity analysis showed that the molecular embeddings extracted from the MESN model exhibit good clustering and diversity.
Overall, considering the fundamental role of the data and the superior prediction performance of the model, we highlight the applicability of MESN on benchmark datasets for further utility in drug discovery-related molecular property prediction.
The Supporting Information is available free of charge at <a class="ext-link" href="/doi/10.1021/acs.jcim.0c00568?goto=supporting-info">https://pubs.acs.org/doi/10.1021/acs.jcim.0c00568</a>.
Supplementary tables for lipophilicity (log <i>D</i>) (<a class="ext-link" href="/doi/suppl/10.1021/acs.jcim.0c00568/suppl_file/ci0c00568_si_001.xls">XLS</a>)
Supplementary tables for membrane permeability (log <i>P</i><sub>app</sub>) (<a class="ext-link" href="/doi/suppl/10.1021/acs.jcim.0c00568/suppl_file/ci0c00568_si_002.xls">XLS</a>)
Supplementary tables for aqueous solubility (log <i>S</i>) (<a class="ext-link" href="/doi/suppl/10.1021/acs.jcim.0c00568/suppl_file/ci0c00568_si_003.xls">XLS</a>)
chemistry, multidisciplinary, medicinal, computer science, interdisciplinary applications, information systems