AFM-RL: Large Protein Complex Docking Using AlphaFold-Multimer and Reinforcement Learning

Tunde Aderinwale, Rashidedin Jahandideh, Zicong Zhang, Bowen Zhao, Yi Xiong, Daisuke Kihara
DOI: https://doi.org/10.1101/2024.01.20.576386
2024-01-23
Abstract: Various biological processes in living cells are carried out by protein complexes, whose interactions can span across multiple protein structures. To understand the molecular mechanisms of such processes, it is crucial to know the quaternary structures of these complexes. Although the structures of many protein complexes have been determined through biophysical experiments, there are still many important complex structures that are yet to be determined, particularly for large complexes with multiple chains. To supplement experimental structure determination, many computational protein docking methods have been developed, but most are limited to two chains, and few are designed for three chains or more. We have previously developed a method, RL-MLZerD, for multiple protein docking, which was applied to complexes with three to five chains. Here, we expand the ability of this method to predict the structures of large protein complexes with six to twenty chains. We use AlphaFold-Multimer (AFM) to predict pairwise models and then assemble them using our reinforcement learning framework. Our new method, AFM-RL, can predict a diverse set of pairwise models, which aids the RL assembly steps for large protein complexes. Additionally, AFM-RL demonstrates improved modeling performance when compared to existing methods for large protein complex docking.
Bioinformatics
What problem does this paper attempt to address?
The paper addresses structure prediction for large protein complexes. Many computational docking methods exist, but most are limited to docking two chains, and predicting complexes with three or more chains remains a challenge. The research team previously developed RL-MLZerD, which was applied to complexes with three to five chains. In this paper, they propose AFM-RL, which combines AlphaFold-Multimer (AFM) with a reinforcement learning (RL) framework to predict the structures of large complexes containing six to twenty chains. AFM is used to generate pairwise models, which are then assembled by the RL algorithm. Because AFM-RL generates a diverse set of pairwise models, it helps address the interface loss and memory limitation issues that arise when assembling large complexes. The study reports that AFM-RL outperforms existing methods on large protein complexes, with an average RMSD of 10.09 Å and a TM-score of 0.81. Compared with the MoLPC method, AFM-RL achieves a lower RMSD, indicating more accurate assembly of the complex structures. The paper also shows that AFM-RL can handle atypical conformations and fully assemble complex structures without chain clashes or missing crucial interaction interfaces. These results highlight the advantages of AFM-RL for predicting the structures of large protein complexes, which contributes to a better understanding of molecular activities and interactions in cellular processes.
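For readers unfamiliar with the RMSD figure quoted above, the following is a minimal sketch of how a root-mean-square deviation between a model and a reference structure can be computed. It assumes the two coordinate sets have already been superimposed and are matched atom by atom. The function name `rmsd` and the toy coordinates are illustrative assumptions, not taken from the paper, and the paper's own evaluation protocol may differ.

```python
import math


def rmsd(model, reference):
    """RMSD between two equally sized lists of (x, y, z) points.

    Assumes the structures are already superimposed and that point i in
    `model` corresponds to point i in `reference` (hypothetical helper).
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate sets must be non-empty and the same length")
    # Sum of squared Euclidean distances between corresponding points.
    squared = sum(
        (mx - rx) ** 2 + (my - ry) ** 2 + (mz - rz) ** 2
        for (mx, my, mz), (rx, ry, rz) in zip(model, reference)
    )
    return math.sqrt(squared / len(model))


# Made-up coordinates in angstroms, for illustration only.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (3.8, 3.0, 0.0), (7.6, 0.0, 4.0)]
print(round(rmsd(model, reference), 3))
```

A lower value means the model's atoms lie closer to the reference positions. In practice an optimal superposition step would normally precede this calculation, and that step is omitted here.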