Full-Atom Peptide Design based on Multi-modal Flow Matching

Jiahan Li, Chaoran Cheng, Zuofan Wu, Ruihan Guo, Shitong Luo, Zhizhou Ren, Jian Peng, Jianzhu Ma
2024-06-02
Abstract: Peptides, short chains of amino acid residues, play a vital role in numerous biological processes by interacting with other target molecules, offering substantial potential in drug discovery. In this work, we present PepFlow, the first multi-modal deep generative model grounded in the flow-matching framework for the design of full-atom peptides that target specific protein receptors. Drawing inspiration from the crucial roles of residue backbone orientations and side-chain dynamics in protein-peptide interactions, we characterize the peptide structure using rigid backbone frames within the $\mathrm{SE}(3)$ manifold and side-chain angles on high-dimensional tori. Furthermore, we represent discrete residue types in the peptide sequence as categorical distributions on the probability simplex. By learning the joint distributions of each modality using derived flows and vector fields on corresponding manifolds, our method excels in the fine-grained design of full-atom peptides. Harnessing the multi-modal paradigm, our approach adeptly tackles various tasks such as fixed-backbone sequence design and side-chain packing through partial sampling. Through meticulously crafted experiments, we demonstrate that PepFlow exhibits superior performance in comprehensive benchmarks, highlighting its significant potential in computational peptide design and analysis.
Biomolecules, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses de novo design of full-atom peptides that bind specific protein receptors. Existing deep generative models have made progress in protein backbone design, but designing peptide binders for a given target remains challenging because the model must also account for side-chain dynamics and the binding pocket. The paper proposes PepFlow, a multi-modal deep generative model built on the conditional flow matching framework, which trains a continuous normalizing flow by learning a vector field that transports samples along a probability path from a prior distribution to the target distribution. PepFlow jointly models three modalities of the peptide: the amino acid sequence (categorical distributions on the probability simplex), the side-chain angles (points on high-dimensional tori), and the rigid backbone frames (elements of $\mathrm{SE}(3)$). Because the modalities are modeled jointly, sampling only some of them while holding the others fixed yields tasks such as fixed-backbone sequence design and side-chain packing. The paper reports that PepFlow outperforms the compared methods on comprehensive benchmarks and highlights its potential in computational peptide design and analysis.
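To make the idea of a conditional probability path concrete, here is a minimal sketch of the training targets flow matching uses. It is illustrative only and not PepFlow's implementation: the summary above does not specify the paths, so the straight-line path in Euclidean space and the shortest-arc path on the torus (for angles such as side-chain torsions) are assumed choices, and the function names `euclidean_path`, `torus_path`, `wrap_angle`, and `cfm_loss` are hypothetical.

```python
import numpy as np


def wrap_angle(a):
    """Wrap angles (radians) into the interval [-pi, pi)."""
    return (a + np.pi) % (2 * np.pi) - np.pi


def euclidean_path(x0, x1, t):
    """Assumed straight-line path from a prior sample x0 to a data sample x1.

    Returns the point x_t at time t and the target velocity u_t,
    which is constant along a straight line.
    """
    x_t = (1.0 - t) * x0 + t * x1
    u_t = x1 - x0
    return x_t, u_t


def torus_path(a0, a1, t):
    """Assumed shortest-arc path between angle vectors a0 and a1 on a torus.

    The displacement is wrapped so each angle moves along the shorter arc;
    the returned velocity is that wrapped displacement.
    """
    delta = wrap_angle(a1 - a0)
    a_t = wrap_angle(a0 + t * delta)
    return a_t, delta


def cfm_loss(v_pred, u_target):
    """Mean squared error between a predicted and a target vector field."""
    return float(np.mean((v_pred - u_target) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.normal(size=3)                    # sample from a Gaussian prior
    x1 = np.array([1.0, 2.0, 3.0])             # stand-in "data" point
    x_t, u_t = euclidean_path(x0, x1, t=0.5)
    # A real model would predict the velocity at (x_t, t); here we use zeros.
    print(cfm_loss(np.zeros_like(u_t), u_t))
```

In training, a network would take the interpolated point and the time as input and be regressed onto the target velocity; sampling then integrates the learned vector field from the prior toward the data distribution. Backbone frames on $\mathrm{SE}(3)$ and residue types on the simplex would each need their own path definitions, which this sketch does not cover.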