Rapid prediction of key residues for foldability by machine learning model enables the design of highly functional libraries with hyperstable constrained peptide scaffolds
Fei Cai,Yuehua Wei,Daniel Kirchhofer,Andrew Chang,Yingnan Zhang
DOI: https://doi.org/10.1371/journal.pcbi.1012609
2024-11-24
PLoS Computational Biology
Abstract: Peptides are an emerging modality for developing therapeutics that can either agonize or antagonize cellular pathways associated with disease, yet peptides often suffer from poor chemical and physical stability, which limits their potential. However, naturally occurring disulfide-constrained peptides (DCPs) and de novo designed Hyperstable Constrained Peptides (HCPs) exhibit highly stable and drug-like scaffolds, making them attractive therapeutic modalities. Previously, we established a robust platform for discovering peptide therapeutics by utilizing multiple DCPs as scaffolds. However, we realized that those libraries could be further improved by considering the foldability of peptide scaffolds during library design. We hypothesized that specific sequence patterns within the peptide scaffolds play a crucial role in spontaneous folding into a stable topology, and thus that these sequences should not have been subject to randomization in the original library design. Therefore, we developed a method for designing highly diverse DCP libraries while preserving the inherent foldability of each scaffold. To achieve this, we first generated a large-scale dataset from yeast surface display (YSD) combined with shotgun alanine scan experiments to train a machine-learning (ML) model based on techniques used for natural language understanding. We then validated the ML model experimentally, showing that it can not only predict the foldability of peptides with high accuracy across a broad range of sequences but also pinpoint residues critical for foldability. Using the insights gained from the alanine scanning experiment as well as the prediction model, we designed a new peptide library based on a de novo designed HCP, which was optimized for enhanced folding efficiency. Subsequent panning trials using this library yielded promising hits with good folding properties. In summary, this work advances peptide or small protein domain library design practices. These findings could pave the way for the efficient development of peptide-based therapeutics in the future.
Peptides show promise as therapeutic agents for influencing cellular pathways, but they often lack stability. Disulfide-constrained peptides (DCPs) and de novo designed Hyperstable Constrained Peptides (HCPs) offer a more stable and drug-like modality. Initially, we developed a platform for creating peptide therapeutics using DCPs. However, we recognized the need to improve peptide library design by preserving the scaffolds' ability to fold into stable molecules. We hypothesized that specific patterns in the peptide sequences were vital for proper folding and should not be altered during randomization. To generate effective libraries, we created a method that keeps each scaffold's foldability intact. By combining yeast surface display (YSD) and alanine scanning, we trained a machine-learning model to predict peptide foldability and identify key residues. This model allowed us to design a new peptide library with optimized foldability. Subsequent tests using this library produced promising results, demonstrating the potential of this method to generate powerful libraries for peptide therapeutic discovery.
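The idea of using a foldability predictor to pinpoint residues that should be held fixed can be illustrated with a minimal in-silico alanine scan. This is only a sketch under stated assumptions: `toy_foldability` is a hypothetical stand-in that simply counts cysteines, not the trained model described here, and the example sequence, function names, and threshold are all invented for illustration.

```python
from typing import Callable, List, Tuple


def alanine_scan(
    sequence: str, predict: Callable[[str], float]
) -> List[Tuple[int, str, float]]:
    """Return (1-based position, wild-type residue, score drop) per position.

    The drop is the baseline score minus the score of the single
    alanine mutant, so a larger drop marks a more critical residue.
    """
    baseline = predict(sequence)
    results = []
    for i, residue in enumerate(sequence):
        if residue == "A":
            continue  # replacing Ala with Ala carries no information
        mutant = sequence[:i] + "A" + sequence[i + 1:]
        results.append((i + 1, residue, baseline - predict(mutant)))
    return results


def key_residues(
    scan: List[Tuple[int, str, float]], threshold: float
) -> List[Tuple[int, str]]:
    """Keep positions whose alanine mutant lowers the score by >= threshold."""
    return [(pos, res) for pos, res, drop in scan if drop >= threshold]


def toy_foldability(sequence: str) -> float:
    """Hypothetical scorer (assumption): fraction of six cysteines present."""
    return min(sequence.count("C"), 6) / 6.0


# Hypothetical six-cysteine scaffold, not a sequence from the study.
scaffold = "GCKSCPGCTDCRGCNNCY"
scan = alanine_scan(scaffold, toy_foldability)
fixed = key_residues(scan, threshold=0.1)
print(fixed)
```

With this toy scorer only the six cysteines are flagged, so a library design built on this sketch would hold those positions fixed and leave the remaining ones available for randomization. Swapping in a real predictor only requires passing a different `predict` callable.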
biochemical research methods,mathematical & computational biology