Fast Rescoring Protocols to Improve the Performance of Structure-Based Virtual Screening Performed on Protein–Protein Interfaces
Natesh Singh, Ludovic Chaput, Bruno O. Villoutreix
DOI: https://doi.org/10.1021/acs.jcim.0c00545
IF: 6.162
2020-08-03
Journal of Chemical Information and Modeling
Abstract: Protein–protein interactions (PPIs) are attractive targets for drug design because of their essential role in numerous cellular processes and disease pathways. However, in general, PPIs display exposed binding pockets at the interface and, as such, have been largely unexploited for therapeutic interventions with low-molecular-weight compounds. Here, we used docking and various rescoring strategies in an attempt to recover PPI inhibitors from a set of active and inactive molecules for 11 targets collected in ChEMBL and PubChem. Our focus is on the screening power of the various developed protocols and on using fast approaches so as to be able to apply such a strategy to the screening of ultralarge libraries in the future. First, we docked compounds into each target using the fast "pscreen" mode of the structure-based virtual screening (VS) package Surflex. Subsequently, the docking poses were postprocessed to derive a set of 3D topological descriptors: (i) shape similarity and (ii) interaction fingerprint similarity with a co-crystallized inhibitor, (iii) solvent-accessible surface area, and (iv) extent of deviation from the geometric center of a reference inhibitor. The derived descriptors, together with descriptor-scaled scoring functions, were utilized to investigate possible impacts on VS performance metrics. Moreover, four standalone scoring functions, RF-Score-VS (machine-learning), DLIGAND2 (knowledge-based), Vinardo (empirical), and X-SCORE (empirical), were employed to rescore the PPI compounds. Collectively, the results indicate that the topological scoring algorithms could be valuable both at a global level, with up to a 79% increase in areas under the receiver operating characteristic curve for some targets, and in early stages, with up to a 4-fold increase in enrichment factors at 1% of the screened collections.
Notably, DLIGAND2 emerged as the best scoring function on this data set, outperforming all rescoring techniques in terms of VS metrics. The described methodology could help in the rational design of small-molecule PPI inhibitors and has direct applications in many therapeutic areas, including cancer, CNS, and infectious diseases such as COVID-19.
Supporting Information: The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.0c00545. It contains a list of all the PDB entries of the protein–ligand complexes of the selected PPI targets, binding site properties of the selected protein structures, adjusted log AUC values, BEDROC values, enrichment factor values at 1 and 5% subsetting, percent of improvement in the AUC values, p-values obtained from comparison of the AUC and the log AUC values between Surflex and the different rescoring methods, Kendall τ correlation coefficient between the AUC values obtained using different scoring methods and the binding site properties of the PPI target proteins, AUC values for the data sets screened using DLIGAND2 and X-SCORE for the best-ranked poses according to the docking score and the best poses obtained from the rescoring of all poses, Pearson correlation coefficients for MW versus Surflex-dock score and MW versus rSASA, regression coefficients for MW versus Surflex-dock score and MW versus rSASA, network map illustrating the functional diversity of the PPI sites used in this study, KNIME workflow for binding site diversity analysis, comparison of the Cα RMSD-normalized distributions of the target proteins, PCA score plots of the PPI data sets, box plots showing the distribution of physicochemical properties of the actives and the inactives of the PPI data sets, linear ROC and semilogarithmic ROC curves for the docking of actives and inactives to the target proteins, Tanimoto similarity values of the actives and inactives with the reference PPI inhibitors, enrichment factor plots, performance strength of different scoring methods, PCA score plot of the PPI targets, and supporting methods (PDF, ci0c00545_si_001.pdf).
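The abstract reports screening power as the area under the ROC curve (AUC) and as the enrichment factor at 1% of the screened collection. As a rough illustration of how these two metrics are commonly computed, and not the authors' own code, the following Python sketch may help. The function names and the toy scores and labels are hypothetical. The sketch assumes that a higher score means a better rank; scores for which lower is better (as with energy-like scoring functions) would need to be negated first. The rule used to pick the size of the top fraction (rounding, with a minimum of one compound) is also an assumption.

```python
def roc_auc(scores, labels):
    """ROC AUC via the pairwise (Mann-Whitney) formulation.

    Returns the fraction of active/inactive pairs in which the active
    has the higher score; ties count as half a win.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor in the top `fraction` of the ranked list.

    EF = (hit rate among the top-ranked compounds)
         / (hit rate in the whole collection).
    """
    n_total = len(scores)
    n_actives = sum(labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty collection with at least one active")
    # Assumed convention: round to the nearest compound, keep at least one.
    n_top = max(1, round(fraction * n_total))
    # Rank from best (highest score) to worst.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(y for _, y in ranked[:n_top])
    return (hits / n_top) / (n_actives / n_total)


# Toy data (hypothetical): 10 compounds, 3 actives (label 1), 7 inactives (label 0).
scores = [9.1, 8.7, 7.9, 7.2, 6.5, 6.1, 5.8, 5.0, 4.4, 3.9]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

print(f"AUC    = {roc_auc(scores, labels):.3f}")                  # AUC    = 0.952
print(f"EF@20% = {enrichment_factor(scores, labels, 0.2):.2f}")   # EF@20% = 3.33
```

The pairwise AUC loop is quadratic in the number of compounds, which is fine for a small illustration; for large collections a rank-based implementation from an established library would be the more practical choice. To compare a rescoring method with the docking score, the same two functions would simply be called on each method's scores for the same labelled compounds.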
chemistry, multidisciplinary, medicinal, computer science, interdisciplinary applications, information systems