ProteiNexus: Illuminating Protein Pathways through Structural Pre-training

Jiashan Li, Xi Chen, He Huang, Mingliang Zeng, Jingcheng Yu, Xinqi Gong, Qiwei Ye
Abstract: Protein representation learning has emerged as a powerful tool for various biological tasks. Most current approaches rely on language models trained on protein sequences. However, recent advances reveal that sequences alone cannot fully capture the abundant information contained in protein structures, which is critical for understanding protein function and for aiding innovative protein design. In this study, we present ProteiNexus, an approach that integrates protein structure learning with numerous downstream tasks. We propose a structural encoding mechanism adept at capturing fine-grained distance details and spatial positioning. By combining a robust pre-training strategy with fine-tuning of lightweight decoders designed for specific downstream tasks, our model exhibits outstanding performance, establishing new benchmarks across a range of tasks. The code and models are available on GitHub.