Expanding the Sequence Spaces of Synthetic Binding Protein Using Deep Learning-Based Framework ProteinMPNN

Yanlin Li, Wantong Jiao, Ruihan Liu, Xuejin Deng, Feng Zhu, Weiwei Xue
DOI: https://doi.org/10.1007/s11704-024-31060-3
IF: 2.6688
2024-01-01
Frontiers of Computer Science
Abstract: Synthetic binding proteins (SBPs) with small size, high solubility and stability, and high affinity are important for protein-based research, treatment, and diagnostics. Over the last several decades, the great majority of SBPs have been produced by site-directed mutagenesis and directed evolution of privileged protein scaffolds. Recent advances in deep learning (DL) have transformed protein structure prediction and design. Here, for the first time, the DL framework ProteinMPNN was applied to the de novo design of 7,245 new synthetic proteins covering 55 different scaffolds, based on the original SBPs collected in our SYNBIP database. Comprehensive bioinformatics analysis indicated that, in addition to excellent sequence recovery, the designed synthetic proteins show significantly improved solubility and thermal stability compared to currently known SBPs. Meanwhile, 8 protein scaffolds particularly suitable for ProteinMPNN were identified, from which the designed synthetic proteins displayed good performance in binding to their corresponding protein targets. Therefore, the DL-based framework shows great potential for target-directed de novo generation of high-quality synthetic protein libraries, which could assist experimental biologists in rational protein engineering to discover novel functional protein binders.