Abstract: Peptides are an emerging modality for developing therapeutics that can either agonize or antagonize cellular pathways associated with disease, yet peptides often suffer from poor chemical and physical stability, which limits their potential. However, naturally occurring disulfide-constrained peptides (DCPs) and de novo designed Hyperstable Constrained Peptides (HCPs) exhibit highly stable and drug-like scaffolds, making them attractive therapeutic modalities. Previously, we established a robust platform for discovering peptide therapeutics by utilizing multiple DCPs as scaffolds. However, we realized that those libraries could be further improved by considering the foldability of peptide scaffolds during library design. We hypothesized that specific sequence patterns within the peptide scaffolds play a crucial role in spontaneous folding into a stable topology, and thus that these sequences should not have been subject to randomization in the original library design. Therefore, we developed a method for designing highly diverse DCP libraries while preserving the inherent foldability of each scaffold. To achieve this, we first generated a large-scale dataset from yeast surface display (YSD) combined with shotgun alanine scan experiments to train a machine-learning (ML) model based on techniques used for natural language understanding. We then validated the ML model experimentally, showing that it can not only predict the foldability of peptides with high accuracy across a broad range of sequences but also pinpoint residues critical for foldability. Using the insights gained from the alanine scanning experiment as well as the prediction model, we designed a new peptide library based on a de novo designed HCP, which was optimized for enhanced folding efficiency. Subsequent panning trials using this library yielded promising hits with good folding properties. In summary, this work advances peptide or small protein domain library design practices.
These findings could pave the way for the efficient development of peptide-based therapeutics in the future.

Peptides show promise as therapeutic agents for influencing cellular pathways, but they often lack stability. Disulfide-constrained peptides (DCPs) and de novo designed Hyperstable Constrained Peptides (HCPs) offer more stable and drug-like modalities. Initially, we developed a platform for discovering peptide therapeutics using DCPs as scaffolds. However, we recognized the need to improve peptide library design by preserving each scaffold's ability to fold into a stable molecule. We hypothesized that specific patterns in the peptide sequences are vital for proper folding and should not be altered during randomization. To generate effective libraries, we created a method that keeps each scaffold's foldability intact. By combining yeast surface display (YSD) and shotgun alanine scanning, we trained a machine-learning model to predict peptide foldability and identify key residues. This model allowed us to design a new peptide library with optimized foldability. Subsequent panning with this library produced promising hits, demonstrating the potential of this method to generate powerful libraries for peptide therapeutic discovery.
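The idea of using a foldability predictor to decide which scaffold positions to hold fixed can be illustrated with a minimal in silico alanine scan. This is a sketch under stated assumptions, not the authors' model or pipeline: `toy_foldability` is a hypothetical stand-in scorer that only counts cysteines, the example sequence is invented, and the `threshold` value is arbitrary. In practice the scorer would be a trained model.

```python
from typing import Callable, List, Tuple


def toy_foldability(seq: str) -> float:
    """Hypothetical stand-in scorer (NOT the paper's ML model).

    Scores a sequence by its cysteine count, capped at 1.0, purely so the
    example runs end to end.
    """
    return min(seq.count("C") / 6, 1.0)


def alanine_scan(
    seq: str, scorer: Callable[[str], float]
) -> List[Tuple[int, str, float]]:
    """Return (position, residue, score drop) for each single X->A substitution."""
    base = scorer(seq)
    drops = []
    for i, aa in enumerate(seq):
        if aa == "A":
            continue  # already alanine, so the substitution would be a no-op
        mutant = seq[:i] + "A" + seq[i + 1:]
        drops.append((i, aa, base - scorer(mutant)))
    return drops


def fixed_positions(
    seq: str, scorer: Callable[[str], float], threshold: float = 0.05
) -> List[int]:
    """Positions whose alanine substitution lowers the predicted score by at
    least `threshold`; these would be excluded from randomization."""
    return [i for i, _, drop in alanine_scan(seq, scorer) if drop >= threshold]


# Invented 16-residue example with six cysteines (assumption, not from the paper).
scaffold = "CKGCEPWCTNCSYCRC"
keep = fixed_positions(scaffold, toy_foldability)
randomizable = [i for i in range(len(scaffold)) if i not in keep]
```

With the toy scorer, only the cysteine positions are flagged, so `keep` holds the six cysteine indices and `randomizable` holds the remaining ten positions. Swapping in a real predictor changes only the `scorer` argument.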

Investigating Active Learning and Meta-Learning for Iterative Peptide Design

Improved design and screening of high bioactivity peptides for drug discovery

Accelerating bioactive peptide discovery via mutual information-based meta-learning

Meta learning addresses noisy and under-labeled data in machine learning-guided antibody engineering

Machine Learning to Develop Peptide Catalysts─Successes, Limitations, and Opportunities

AutoPeptideML: A study on how to build more trustworthy peptide bioactivity predictors

Discovering de novo peptide substrates for enzymes using machine learning

Combining genetic algorithm with machine learning strategies for designing potent antimicrobial peptides

Reinforcement learning-driven exploration of peptide space: accelerating generation of drug-like peptides

Active Finetuning Protein Language Model: A Budget-Friendly Method for Directed Evolution

Machine learning-guided discovery and design of non-hemolytic peptides

Meta Learning for Low-Resource Molecular Optimization

Improving few-shot learning-based protein engineering with evolutionary sampling

Deep Learning-Based Bioactive Therapeutic Peptide Generation and Screening

Computational Site Saturation Mutagenesis of Canonical and Non-Canonical Amino Acids to Probe Protein-Peptide Interactions

Target specific peptide design using latent space approximate trajectory collector

High Throughput Meta-analysis of Antimicrobial Peptides for Characterizing Class Specific Therapeutic Candidates: An Approach

Active learning for affinity prediction of antibodies

Active learning for energy-based antibody optimization and enhanced screening

Protocol for iterative optimization of modified peptides bound to protein targets

Rapid prediction of key residues for foldability by machine learning model enables the design of highly functional libraries with hyperstable constrained peptide scaffolds