Discovery of Dual FGFR4 and EGFR Inhibitors by Machine Learning and Biological Evaluation
Xingye Chen, Wuchen Xie, Yan Yang, Yi Hua, GuoMeng Xing, Li Liang, Chenglong Deng, Yuchen Wang, Yuanrong Fan, Haichun Liu, Tao Lu, Yadong Chen, Yanmin Zhang
DOI: https://doi.org/10.1021/acs.jcim.0c00652
IF: 6.162
2020-09-14
Journal of Chemical Information and Modeling
Abstract: Kinase inhibitors are widely used in antitumor research, but problems such as drug resistance and off-target toxicity remain. A more suitable approach is to design a multitarget inhibitor with a degree of selectivity. Herein, computational and experimental studies were applied to the discovery of dual inhibitors against FGFR4 and EGFR. A quantitative structure–property relationship (QSPR) study was carried out to predict FGFR4 and EGFR activity using data sets of 843 and 5088 compounds, respectively. Four machine learning models, support vector machine (SVM), random forest (RF), gradient boost regression tree (GBRT), and XGBoost (XGB), were built using the most suitable features selected by the mutual information algorithm. For both FGFR4 and EGFR, SVM showed the best performance, with <i>R</i><sup>2</sup><sub>test-FGFR4</sub> = 0.80 and <i>R</i><sup>2</sup><sub>test-EGFR</sub> = 0.75, and demonstrated excellent model stability; it was then used to predict the activity of compounds from an in-house database. Finally, compound <b>1</b> was selected, which exhibits inhibitory activity against both FGFR4 (IC<sub>50</sub> = 86.2 nM) and EGFR (IC<sub>50</sub> = 83.9 nM) kinases. Furthermore, molecular docking and molecular dynamics simulations were performed to identify key amino acids for the interaction of compound <b>1</b> with FGFR4 and EGFR. In this paper, machine-learning-based QSAR models were established and effectively applied to the discovery of dual-target inhibitors against FGFR4 and EGFR, demonstrating the great potential of machine learning strategies in dual inhibitor discovery.

The Supporting Information is available free of charge at <a class="ext-link" href="/doi/10.1021/acs.jcim.0c00652?goto=supporting-info">https://pubs.acs.org/doi/10.1021/acs.jcim.0c00652</a>.

Supporting Methods: <i>Y</i>-randomization test (p 2). Supporting Figures: Figure S1. The relationship between <i>Q</i><sup>2</sup> and the number of descriptors for FGFR4. Figure S2. Distribution of <i>Q</i><sup>2</sup> of randomized models compared with the true model in the <i>Y</i>-randomization test. Figure S3. Correlograms between each important descriptor and permeability values for the two activity prediction models: (a) FGFR4, (b) EGFR. Figure S4. Comparison of the molecular similarity between the training set and the external set: (a) FGFR4 data set, (b) EGFR data set. Figure S5. Structures of compounds <b>1</b>–<b>3</b>. Figures S6 and S7. The chemical structures of the modeling molecules of the FGFR4/EGFR activity prediction model. Figure S8. The chemical structures of the 23 compounds in the external verification set. Figure S9. Docking results of FGFR4 and EGFR. (<a class="ext-link" href="/doi/suppl/10.1021/acs.jcim.0c00652/suppl_file/ci0c00652_si_001.pdf">PDF</a>)

Table S1. Experimental and predicted IC<sub>50</sub> values of FGFR4 for the whole data set of 843 compounds. Table S2. Experimental and predicted IC<sub>50</sub> values of EGFR for the whole data set of 5088 compounds. Table S3. Detailed information for the external validation set. Table S4. Statistical analysis of 180 descriptors selected in FGFR4 activity prediction. Table S5. Statistical analysis of 280 descriptors selected in EGFR activity prediction. Table S6. Detailed information on the most essential variables for FGFR4 activity prediction modeling. Table S7. Detailed information on the most essential variables for EGFR activity prediction modeling. Table S8. The original data for cross-docking of FGFR4 and EGFR. Table S9. Residue-specific binding free energies of FGFR4 (PDB code: <a href="https://www.rcsb.org/pdb/search/structidSearch.do?structureId=4UXQ">4UXQ</a>) and compound <b>1</b>. Table S10. Residue-specific binding free energies of EGFR (PDB code: <a href="https://www.rcsb.org/pdb/search/structidSearch.do?structureId=4ZAU">4ZAU</a>) and compound <b>1</b>. (<a class="ext-link" href="/doi/suppl/10.1021/acs.jcim.0c00652/suppl_file/ci0c00652_si_002.xlsx">XLSX</a>)
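The abstract mentions two computational ingredients that a short example may make more concrete: ranking descriptors by mutual information with the activity, and scoring predictions with the coefficient of determination (R²). The following is a minimal sketch only, not the authors' implementation. It assumes a simple equal-width histogram estimator of mutual information (the estimator and settings used in the paper are not stated here), and every function name, descriptor name, and data value is hypothetical toy material rather than data from the study.

```python
import math
from collections import Counter


def bin_indices(values, bins=5):
    """Map continuous values to equal-width bin indices 0..bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant columns
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=5):
    """Histogram estimate of I(X;Y) in nats (an assumed, simple estimator)."""
    n = len(x)
    bx, by = bin_indices(x, bins), bin_indices(y, bins)
    cx, cy, cxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    # Sum p(x,y) * log(p(x,y) / (p(x) p(y))) over occupied cells,
    # written with raw counts to limit rounding error.
    return sum(
        (c / n) * math.log(c * n / (cx[i] * cy[j]))
        for (i, j), c in cxy.items()
    )


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: 100 "compounds" with one activity value each.
activity = [i / 10 for i in range(100)]
descriptors = {
    # Linear function of the activity, so it carries information about it.
    "informative": [2.0 * a + 1.0 for a in activity],
    # Cycles through 0..4 regardless of the activity level.
    "uninformative": [float(i % 5) for i in range(100)],
}

# Score each descriptor against the activity and rank from high to low.
scores = {name: mutual_information(col, activity) for name, col in descriptors.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In a workflow like the one described, the top-ranked descriptors would feed a regressor such as an SVM, and `r_squared` would be evaluated on held-out test compounds. The number of descriptors to keep and the bin count are choices this sketch does not attempt to reproduce.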
chemistry, multidisciplinary, medicinal, computer science, interdisciplinary applications, information systems