Mixture of Experts Enable Efficient and Effective Protein Understanding and Design

Ning Sun, Shuxian Zou, Tianhua Tao, Sazan Mahbub, Dian Li, Yonghao Zhuang, Hongyi Wang, Xingyi Cheng, Le Song, Eric P. Xing
DOI: https://doi.org/10.1101/2024.11.29.625425
2024-12-03
Abstract: Proteins play a fundamental role in life. Understanding the language of proteins offers significant potential for gaining mechanistic insights into biological systems and introduces new avenues for treating diseases, enhancing agriculture, and safeguarding the environment. While large protein language models (PLMs) like ESM2-15B and xTrimoPGLM-100B have achieved remarkable performance in diverse protein understanding and design tasks, these models, being dense transformer models, pose challenges due to their computational inefficiency during training and deployment. In this work, we introduce AIDO.Protein, a pretrained module for protein representation in an AI-driven Digital Organism. AIDO.Protein is also the first mixture-of-experts (MoE) model in the protein domain, with model size scaling to 16 billion parameters. Leveraging a sparse MoE architecture with 8 experts within each transformer block and selectively activating 2 experts for each input token, our model is significantly more efficient in training and inference. Through pre-training on 1.2 trillion amino acids collected from UniRef90 and ColabfoldDB, our model achieves state-of-the-art results across most tasks in the xTrimoPGLM benchmark. Furthermore, on over 280 ProteinGym Deep Mutational Scanning (DMS) assays, our model achieves nearly 99% of the overall performance of the best MSA-based model and significantly outperforms the previously reported state-of-the-art models that do not utilize multiple sequence alignments (MSAs). We also adapted this model for structure-conditioned protein sequence generation tasks and achieved new state-of-the-art (SOTA) results in this domain. These results indicate that AIDO.Protein can serve as a strong foundation model for protein understanding and design. Models and code are available through ModelGenerator at https://github.com/genbio-ai/AIDO and on Hugging Face.
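The sparse routing described in the abstract (8 experts per transformer block, 2 activated per token) can be illustrated with a small toy sketch. This is not the AIDO.Protein implementation: the gating rule (softmax renormalised over the two selected router logits), the single-matrix "experts", the dimension of 4, and all function names below are assumptions chosen only to show the general top-2 mixture-of-experts idea.

```python
import math
import random

NUM_EXPERTS = 8  # experts per block, as stated in the abstract
TOP_K = 2        # experts activated per token, as stated in the abstract


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def route_top_k(router_logits, k=TOP_K):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Assumption: weights are a softmax over the selected logits only.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_layer(token, router, experts, k=TOP_K):
    """Run one token through a toy MoE layer; only k experts are evaluated."""
    logits = matvec(router, token)      # one router score per expert
    chosen = route_top_k(logits, k)
    out = [0.0] * len(token)
    for idx, weight in chosen:
        y = matvec(experts[idx], token)  # toy expert: a single linear map
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, chosen


random.seed(0)
DIM = 4  # hypothetical hidden size, for illustration only
router = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
experts = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
           for _ in range(NUM_EXPERTS)]
token = [random.gauss(0, 1) for _ in range(DIM)]

output, chosen = moe_layer(token, router, experts)
print("experts used:", [idx for idx, _ in chosen])
```

In this sketch each token triggers only 2 of the 8 expert computations, which is the source of the efficiency gain over a dense layer that would evaluate all of its feed-forward parameters for every token.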
Bioinformatics
What problem does this paper attempt to address?