Enhancing AlphaFold-Multimer-based Protein Complex Structure Prediction with MULTICOM in CASP15

Jian Liu, Zhiye Guo, Tianqi Wu, Raj S Roy, Farhan Quadir, Chen Chen, Jianlin Cheng
DOI: https://doi.org/10.1101/2023.05.16.541055
2023-05-18
bioRxiv
Abstract: AlphaFold-Multimer has emerged as the state-of-the-art tool for predicting the quaternary structure of protein complexes (assemblies or multimers) since its release in 2021. To further enhance AlphaFold-Multimer-based complex structure prediction, we developed a new quaternary structure prediction system (MULTICOM) that improves the input fed to AlphaFold-Multimer and evaluates and refines the outputs it generates. Specifically, MULTICOM samples diverse multiple sequence alignments (MSAs) and templates for AlphaFold-Multimer to generate structural models, using both traditional sequence alignments and new Foldseek-based structure alignments; ranks the structural models with multiple complementary metrics; and refines them with a Foldseek structure alignment-based refinement method. The MULTICOM system, in different implementations, was blindly tested in the assembly structure prediction category of the 15th Critical Assessment of Techniques for Protein Structure Prediction (CASP15) in 2022 as both server and human predictors. Our server (MULTICOM_qa) ranked 3rd among 26 CASP15 server predictors, and our human predictor (MULTICOM_human) ranked 7th among 87 CASP15 server and human predictors. The average TM-score of the first models predicted by MULTICOM_qa for CASP15 assembly targets is ~0.76, 5.3% higher than the ~0.72 of the standard AlphaFold-Multimer. The average TM-score of the best of the top 5 models predicted by MULTICOM_qa is ~0.80, about 8% higher than the ~0.74 of the standard AlphaFold-Multimer. Moreover, the novel Foldseek Structure Alignment-based Model Generation (FSAMG) method based on AlphaFold-Multimer outperforms the widely used sequence alignment-based model generation. The source code of MULTICOM is available at: https://github.com/BioinfoMachineLearning/MULTICOM3.
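The relative improvements quoted in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the helper `relative_gain` is a hypothetical name, and the inputs are the rounded average TM-scores from the abstract. With those rounded inputs the top-1 gain comes out near 5.6% rather than the quoted 5.3%, which presumably reflects the authors computing from unrounded averages (an assumption, not something the abstract states).

```python
def relative_gain(new: float, baseline: float) -> float:
    """Return the relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Rounded average TM-scores as quoted in the abstract (MULTICOM_qa vs. standard AlphaFold-Multimer).
top1_gain = relative_gain(0.76, 0.72)   # first submitted model
best5_gain = relative_gain(0.80, 0.74)  # best of the top 5 models

print(f"top-1 gain: {top1_gain:.1f}%")
print(f"best-of-5 gain: {best5_gain:.1f}%")
```

Because the abstract reports the scores only to two decimals, small differences between this recomputation and the quoted percentages are expected.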