Prediction of P2Y12 Antagonists Using a Novel Genetic Algorithm-Support Vector Machine Coupled Approach.

Ming Hao, Yan Li, Yonghua Wang, Shuwei Zhang
DOI: https://doi.org/10.1016/j.aca.2011.02.004
IF: 6.911
2011-01-01
Analytica Chimica Acta
Abstract: A genetic algorithm (GA)-support vector machine (SVM) coupled approach is proposed for optimizing the subset of 2D molecular descriptors generated for a series of antagonists of P2Y12, a member of the G-protein-coupled receptor family, with the statistical performance and efficiency of the model being simultaneously enhanced by SVM kernel-based nonlinear projection. To our knowledge, this is the first QSAR study to predict P2Y12 inhibition activity from an unusually large, structurally diverse dataset of 364 P2Y12 antagonists. In addition, three other widely used approaches, i.e., partial least squares (PLS), random forest (RF), and Gaussian process (GP) routines combined with GA (GA-PLS, GA-RF, and GA-GP, respectively), are employed and compared with the GA-SVM method in terms of several rigorous evaluation criteria. The results indicate that the GA-SVM model is a powerful tool for predicting P2Y12 antagonist activity, producing a conventional correlation coefficient R² of 0.976 and a cross-validated R²(cv) of 0.829 for the training set, as well as an R²(pred) of 0.811 for the test set. It significantly outperforms the other three methods, whose average values are R² = 0.894, R²(cv) = 0.741, and R²(pred) = 0.693. The proposed model, with excellent predictive capacity in both internal and external validation, should be helpful for screening and optimizing potential P2Y12 antagonists prior to chemical synthesis in drug development.
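The descriptor-subset search described in the abstract can be illustrated with a minimal sketch of a GA wrapper around a pluggable fitness function. The abstract does not specify the GA operators, population size, or SVM settings, so everything below (tournament selection, one-point crossover, bit-flip mutation, elitism, and all parameter values) is an assumption. In a GA-SVM setting the fitness would typically be a cross-validated score of an SVM trained on the selected descriptors; here a hypothetical toy fitness stands in so that the block runs with the Python standard library alone.

```python
import random
from typing import Callable, Tuple

Mask = Tuple[int, ...]  # 1 = descriptor kept, 0 = descriptor dropped


def ga_select(
    n_descriptors: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 30,
    generations: int = 40,
    crossover_rate: float = 0.8,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Tuple[Mask, float]:
    """Return the best descriptor mask found and its fitness (higher is better)."""
    rng = random.Random(seed)

    def tournament(scored):
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    population = [
        tuple(rng.randint(0, 1) for _ in range(n_descriptors))
        for _ in range(pop_size)
    ]
    best_mask, best_score = population[0], fitness(population[0])

    for _ in range(generations):
        scored = [(fitness(m), m) for m in population]
        for score, mask in scored:
            if score > best_score:
                best_mask, best_score = mask, score

        next_population = [best_mask]  # elitism: carry the best mask forward
        while len(next_population) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            if n_descriptors > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n_descriptors)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1
            # Bit-flip mutation, applied independently to each position.
            child = tuple(
                bit ^ 1 if rng.random() < mutation_rate else bit for bit in child
            )
            next_population.append(child)
        population = next_population

    # Score the final generation, which the loop above has not yet evaluated.
    for mask in population:
        score = fitness(mask)
        if score > best_score:
            best_mask, best_score = mask, score
    return best_mask, best_score


# Hypothetical toy fitness: reward three "informative" descriptor indices and
# penalize subset size. A real GA-SVM run would replace this with a
# cross-validated score of an SVM fitted on the selected descriptors.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask: Mask) -> float:
    kept = [i for i, bit in enumerate(mask) if bit]
    hits = sum(1 for i in kept if i in INFORMATIVE)
    return hits - 0.1 * len(kept)


best_mask, best_score = ga_select(n_descriptors=10, fitness=toy_fitness)
print(best_mask, best_score)
```

Passing the fitness as a callable keeps the search loop independent of the learner, which is also what would allow the same wrapper to be paired with PLS, RF, or GP models for a comparison like the one reported above.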