DNDesign: Enhancing Physical Understanding of Protein Inverse Folding Model via Denoising

Youhan Lee, Jaehoon Kim
DOI: https://doi.org/10.1101/2023.12.05.570298
2024-02-15
Abstract: Based on the central dogma that protein structure determines its functionality, an important approach to protein sequence design is to identify promising sequences that fold into pre-designed structures based on domain knowledge. Numerous studies have introduced deep generative model-based inverse folding, which uses various generative models to translate fixed backbones into corresponding sequences. In this work, we reveal that denoising training enables models to deeply capture the protein energy landscape, which previous models do not fully leverage. Based on this, we propose Denoising-enhanced protein fixed backbone design (DNDesign), which combines conventional inverse-folding networks with a novel plug-in module that learns physical understanding via denoising training and transfers that knowledge to the entire network. Through extensive experiments, we demonstrate that DNDesign can easily be integrated into state-of-the-art models and improves performance in multiple modes, including auto-regressive, non-auto-regressive, and scaled-up scenarios. Furthermore, we introduce a fixed backbone conservation analysis based on potential energy changes, which confirms that DNDesign ensures more energetically favorable inverse folding.
Biochemistry