Scaffold-Lab: Critical Evaluation and Ranking of Protein Backbone Generation Methods in A Unified Framework

Zhuoqi Zheng, Bo Zhang, Bozitao Zhong, Kexin Liu, Zhengxin Li, Junjie Zhu, Jinyu Yu, Ting Wei, Hai-Feng Chen
DOI: https://doi.org/10.1101/2024.02.10.579743
2024-05-10
Abstract: De novo protein design has undergone rapid development in recent years, especially for backbone generation, which stands out as more challenging yet valuable, offering the ability to design novel protein folds with fewer constraints. However, a comprehensive delineation of its potential for practical application in protein engineering remains lacking, as does a standardized evaluation framework to accurately assess the diverse methodologies within this field. Here, we proposed the Scaffold-Lab benchmark, which focuses on evaluating unconditional generation across metrics such as designability, novelty, diversity, efficiency and structural properties. We also extended our benchmark to include the motif-scaffolding problem, demonstrating the utility of these conditional generation models. Our findings reveal that FrameFlow and RFdiffusion in unconditional generation, along with RFdiffusion and GPDL in conditional generation, showcased the most outstanding performances. Furthermore, we described a systematic study to investigate conditional generation and applied it to the motif-scaffolding task, offering a novel perspective for the analysis and development of conditional protein design methods. All data and scripts will be available at https://github.com/Immortals-33/Scaffold-Lab.
Bioinformatics
What problem does this paper attempt to address?
This paper addresses the evaluation and ranking of protein backbone generation methods. The authors propose a unified framework, Scaffold-Lab, that evaluates backbone generation techniques in terms of designability, novelty, diversity, efficiency, and structural properties. Many methods for generating backbones now exist, but the field lacks a standardized evaluation system, particularly one that reflects their potential for protein engineering applications. The authors selected seven representative methods and tested them on two tasks: unconditional generation and conditional generation, the latter focused on the motif-scaffolding problem. In this evaluation, FrameFlow and RFdiffusion performed best in unconditional generation, while RFdiffusion and GPDL performed best in conditional generation. The study also analyzed structural properties of the generated backbones to clarify the progress and limitations of current methods, with the aim of guiding future development in protein design.