Interface-aware molecular generative framework for protein-protein interaction modulators

Jianmin Wang, Jiashun Mao, Chunyan Li, Hongxin Xiang, Xun Wang, Shuang Wang, Zixu Wang, Yangyang Chen, Yuquan Li, Kyoung Tai No, Tao Song, Xiangxiang Zeng
DOI: https://doi.org/10.1101/2023.10.10.557742
2024-10-23
Abstract: Protein-protein interactions (PPIs) play a crucial role in numerous biochemical and biological processes. Although several structure-based molecular generative models have been developed, PPI interfaces and compounds targeting PPIs exhibit distinct physicochemical properties compared to traditional binding pockets and small-molecule drugs. As a result, generating compounds that effectively target PPIs, particularly by considering PPI complexes or interface hotspot residues, remains a significant challenge. In this work, we constructed a comprehensive dataset of PPI interfaces with active and inactive compound pairs. Based on this, we propose a novel molecular generative framework tailored to PPI interfaces, named GENiPPI. Our evaluation demonstrates that GENiPPI captures the implicit relationships between the PPI interfaces and the active molecules, and can generate novel compounds that target these interfaces. Moreover, GENiPPI can generate structurally diverse novel compounds with limited PPI interface modulators. To the best of our knowledge, this is the first exploration of a structure-based molecular generative model focused on PPI interfaces, which could facilitate the design of PPI modulators. The PPI interface-based molecular generative model enriches the existing landscape of structure-based (pocket/interface) molecular generative models.
Bioinformatics
What problem does this paper attempt to address?
This paper aims to develop a molecular generative framework that can generate compounds targeting protein-protein interaction (PPI) interfaces. The authors note that existing structure-based molecular generative models struggle with PPI targets because PPI interfaces, and the compounds that target them, differ markedly in physicochemical properties from traditional binding pockets and small-molecule drugs. Designing a generation method that accounts for PPI complexes or interface hotspot residues is therefore an open problem.

### Research Background and Problem Description

1. **Importance of PPIs**:
   - Protein-protein interactions (PPIs) play a crucial role in numerous biochemical and biological processes.
   - PPI modulators can expand the drug target space and have great potential in drug discovery.
2. **Limitations of Existing Methods**:
   - Traditional structure-based rational design plays an important role in identifying lead compounds, but PPI targets differ significantly from traditional drug targets in their biochemical characteristics (see Table 1 in the paper).
   - These structural differences mean that PPI modulators differ from conventional drugs in physicochemical properties and drug-likeness.
   - Molecular generative models that target PPI structures or interfaces are rarely reported in the literature.
3. **Specific Problems**:
   - How can novel compounds that effectively target a PPI interface be generated?
   - How can deep learning capture the implicit relationships between PPI interfaces and active molecules, and generate diverse and novel compounds?

### Solution: GENiPPI Framework

To address these problems, the authors propose a new molecular generative framework, GENiPPI (interface-aware molecular generative framework for protein-protein interaction modulators).
This framework is implemented through the following steps:

1. **Dataset Construction**:
   - A comprehensive dataset of PPI interfaces paired with active and inactive compounds was constructed.
2. **Model Architecture**:
   - **GAT Module**: Graph Attention Networks (GATs) capture atomic-level interaction features of the protein complex interface.
   - **CNN Module**: Convolutional Neural Networks (CNNs) encode compound representations in voxel and electron-density spaces.
   - **cWGAN Module**: A Conditional Wasserstein Generative Adversarial Network (cWGAN) integrates these features and is trained to generate compound representations targeting the PPI interface.
   - **LSTM Decoder**: A Long Short-Term Memory (LSTM) network decodes the molecular embeddings into SMILES strings.
3. **Performance Evaluation**:
   - The generated compounds are evaluated with multiple metrics, including QED, QEPPI, and Fsp3.
   - GENiPPI is compared with other generative models (such as LatentGAN and ORGAN) in terms of novelty, diversity, and effectiveness.

### Conclusion

The GENiPPI framework performs well at generating new compounds with drug-like and PPI-targeting properties, especially in few-shot generation tasks. It provides a tool for structure-based PPI modulator design and adds to the existing body of structure-based molecular generative models.
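For readers unfamiliar with the cWGAN module, the standard conditional Wasserstein GAN objective is written out below. The summary does not state the exact loss used in GENiPPI, so this is the textbook form under assumed notation: $x$ is a compound representation, $c$ is the conditioning vector (here, the encoded PPI interface features), $z$ is a noise vector, $G$ is the generator, and $D$ is the critic, constrained to the set $\mathcal{D}$ of 1-Lipschitz functions.

```latex
\min_{G}\;\max_{D \in \mathcal{D}}\;
  \mathbb{E}_{(x,\,c) \sim p_{\text{data}}}\bigl[ D(x, c) \bigr]
  \;-\;
  \mathbb{E}_{z \sim p_{z},\; c \sim p_{c}}\bigl[ D\bigl(G(z, c),\, c\bigr) \bigr]
```

The critic is trained to score real (compound, interface) pairs above generated ones, and the generator is trained to close that gap. How the Lipschitz constraint is enforced (for example weight clipping or a gradient penalty) is not specified in the summary.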
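The novelty and diversity comparison mentioned under Performance Evaluation can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: it assumes the SMILES strings are already canonicalized, and `char_ngrams` is a hypothetical stand-in for a real molecular fingerprint, which a cheminformatics toolkit would normally provide. All function names are illustrative.

```python
from itertools import combinations


def uniqueness(generated):
    """Fraction of generated strings that are distinct."""
    return len(set(generated)) / len(generated) if generated else 0.0


def novelty(generated, training):
    """Fraction of distinct generated strings absent from the training set."""
    unique = set(generated)
    if not unique:
        return 0.0
    return len(unique - set(training)) / len(unique)


def char_ngrams(smiles, n=2):
    """Hypothetical fingerprint stand-in: the set of character n-grams."""
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def internal_diversity(generated, n=2):
    """One minus the mean pairwise Tanimoto similarity over distinct strings."""
    feats = [char_ngrams(s, n) for s in sorted(set(generated))]
    pairs = list(combinations(feats, 2))
    if not pairs:
        return 0.0
    return 1.0 - sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


# Toy inputs, assumed to be canonical SMILES (illustrative only).
generated = ["CCO", "CCO", "c1ccccc1", "CCN"]
training = ["CCO", "CCC"]

print("uniqueness:", uniqueness(generated))
print("novelty:", novelty(generated, training))
print("internal diversity:", internal_diversity(generated))
```

In practice the n-gram sets would be replaced by chemically meaningful fingerprints, and strings would be canonicalized before comparison so that two spellings of the same molecule are not counted as different.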