Prediction of Chemical-Protein Interactions: Multitarget-QSAR Versus Computational Chemogenomic Methods

Feixiong Cheng, Yadi Zhou, Jie Li, Weihua Li, Guixia Liu, Yun Tang
DOI: https://doi.org/10.1039/c2mb25110h
2012-01-01
Molecular BioSystems
Abstract: Elucidation of chemical-protein interactions (CPI) is the basis of target identification and drug discovery. Determining CPI experimentally is time-consuming and costly, and computational methods can facilitate the process. In this study, two methods, multitarget quantitative structure-activity relationship (mt-QSAR) and computational chemogenomics, were developed for CPI prediction. Two comprehensive data sets were collected from the ChEMBL database for method assessment. One data set consisted of 81 689 CPI pairs among 50 924 compounds and 136 G-protein coupled receptors (GPCRs), while the other contained 43 965 CPI pairs among 23 376 compounds and 176 kinases. The area under the receiver operating characteristic curve (AUC) for the test sets ranged from 0.95 to 1.0 for 100 GPCR mt-QSAR models and from 0.82 to 1.0 for 100 kinase mt-QSAR models. With the chemogenomic method, the AUC of 5-fold cross-validation was about 0.92 for both the 176 kinases and the 136 GPCRs. However, the chemogenomic method performed worse than mt-QSAR on the external validation set. Further analysis revealed that the chemogenomic method had a high false positive rate on the external validation set. In addition, we developed a web server named CPI-Predictor, http://www.lmmd.org/online_services/cpi_predictor/, which is freely available. The methods and tool have potential applications in network pharmacology and drug repositioning.
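For readers unfamiliar with the AUC metric reported above, the following is a minimal sketch of how it can be computed from binary interaction labels and predicted scores. It uses the standard pairwise (Mann-Whitney) formulation and is not taken from the paper; the function name `roc_auc` and the toy labels and scores are hypothetical and serve only as illustration.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC for binary labels (1 = interacting, 0 = not).

    Computed as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half a win.
    This is an illustrative helper, not the authors' implementation.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): four compound-protein pairs.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 1.0 means every true interaction is ranked above every non-interaction, while 0.5 corresponds to random ranking. The pairwise loop is quadratic in the number of examples, so a rank-based implementation would be preferable for data sets of the size described here.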