Comprehensive detection and characterization of human druggable pockets through novel binding site descriptors

Arnau Comajuncosa-Creus, Guillem Jorba, Xavier Barril, Patrick Aloy
DOI: https://doi.org/10.1101/2024.03.14.584971
2024-03-16
Abstract: Druggable pockets are protein regions that have the ability to bind organic small molecules, and their characterization is essential in target-based drug discovery. However, strategies to derive pocket descriptors are scarce and usually exhibit limited applicability. Here, we present PocketVec, a novel approach to generate pocket descriptors for any protein binding site of interest through the inverse virtual screening of lead-like molecules. We assess the performance of our descriptors in a variety of scenarios, showing that it is on par with the best available methodologies, while overcoming some important limitations. In parallel, we systematically search for druggable pockets in the folded human proteome, using experimentally determined protein structures and AlphaFold2 models, identifying over 32,000 binding sites in more than 20,000 protein domains. Finally, we derive PocketVec descriptors for each small molecule binding site and run an all-against-all similarity search, exploring over 1.2 billion pairwise comparisons. We show how PocketVec descriptors facilitate the identification of druggable pocket similarities not revealed by structure- or sequence-based comparisons. Indeed, our analyses unveil dense clusters of similar pockets in distinct proteins for which no inhibitor has yet been crystalized, opening the door to strategies to prioritize the development of chemical probes to cover the druggable space.
Bioinformatics
What problem does this paper attempt to address?
This paper addresses the identification and characterization of druggable pockets across the human proteome for target-based drug discovery. The authors propose PocketVec, a method that generates descriptors for any protein binding site through the inverse virtual screening of lead-like molecules, and they evaluate its performance in a variety of scenarios, finding it on par with the best available methodologies. PocketVec rests on the assumption that similar pockets bind similar small molecules: it describes a pocket by converting the ranking of the screened molecules into a vector. The approach does not require a ligand in the crystal structure and does not depend on structural or sequence alignments, which overcomes limitations of some existing methods. The researchers systematically search for druggable pockets in the folded human proteome using experimentally determined structures and AlphaFold2 models, identifying over 32,000 binding sites in more than 20,000 protein domains. An all-against-all similarity search over these sites, covering more than 1.2 billion pairwise comparisons, reveals dense clusters of similar pockets in distinct proteins for which no inhibitor has yet been crystallized, which suggests a strategy for prioritizing the development of chemical probes to cover the druggable space. The paper also shows that PocketVec uncovers pocket similarities that structure- or sequence-based comparisons do not reveal. In summary, the main contribution is a new and interpretable method for generating pocket descriptors that improves the understanding of protein binding sites and their potential for drug discovery.
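The idea of turning a molecule ranking into a pocket descriptor can be illustrated with a minimal sketch. It is not the authors' implementation: the docking scores are made up, the function names `rank_vector` and `cosine_distance` are hypothetical, the convention that lower scores are better is an assumption, and cosine distance is chosen here only for illustration, since the summary does not state which metric the paper uses.

```python
from math import sqrt


def rank_vector(scores):
    """Turn docking scores for a fixed molecule panel into a rank vector.

    Assumption: lower score = better predicted binding, so the best
    molecule gets rank 1. Tied scores share their average rank.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def cosine_distance(u, v):
    """Cosine distance between two descriptors (illustrative metric choice)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Made-up scores for the same four molecules docked into three pockets.
pocket_a = rank_vector([-30.0, -12.5, -25.1, -8.0])
pocket_b = rank_vector([-28.0, -10.0, -26.0, -9.5])
pocket_c = rank_vector([-9.0, -27.0, -11.0, -31.0])

print(cosine_distance(pocket_a, pocket_b))  # pockets that rank the panel alike
print(cosine_distance(pocket_a, pocket_c))  # pockets that rank it differently
```

Because every pocket is screened against the same molecule panel, the descriptors share a fixed length and can be compared directly without aligning structures or sequences. In this toy example pockets A and B rank the panel identically, so their distance is essentially zero, while pocket C prefers different molecules and lies further away.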