A Perspective on the Prospective Use of AI in Protein Structure Prediction
Raphaelle Versini,Sujith Sritharan,Burcu Aykac Fas,Thibault Tubiana,Sana Zineb Aimeur,Julien Henri,Marie Erard,Oliver Nüsse,Jessica Andreani,Marc Baaden,Patrick Fuchs,Tatiana Galochkina,Alexios Chatzigoulas,Zoe Cournia,Hubert Santuz,Sophie Sacquin-Mora,Antoine Taly
DOI: https://doi.org/10.1021/acs.jcim.3c01361
2024-01-08
Abstract:AlphaFold2 (AF2) and RoseTTaFold (RF) have revolutionized structural biology, serving as highly reliable and effective methods for predicting protein structures. This article explores their impact and limitations, focusing on their integration into experimental pipelines and their application in diverse protein classes, including membrane proteins, intrinsically disordered proteins (IDPs), and oligomers. In experimental pipelines, AF2 models help X-ray crystallography in resolving the phase problem, while complementarity with mass spectrometry and NMR data enhances structure determination and protein flexibility prediction. Predicting the structure of membrane proteins remains challenging for both AF2 and RF due to difficulties in capturing conformational ensembles and interactions with the membrane. Improvements in incorporating membrane-specific features and predicting the structural effect of mutations are crucial. For intrinsically disordered proteins, AF2's confidence score (pLDDT) serves as a competitive disorder predictor, but integrative approaches including molecular dynamics (MD) simulations or hydrophobic cluster analyses are advocated for accurate dynamics representation. AF2 and RF show promising results for oligomeric models, outperforming traditional docking methods, with AlphaFold-Multimer showing improved performance. However, some caveats remain in particular for membrane proteins. Real-life examples demonstrate AF2's predictive capabilities in unknown protein structures, but models should be evaluated for their agreement with experimental data. Furthermore, AF2 models can be used complementarily with MD simulations. In this Perspective, we propose a "wish list" for improving deep-learning-based protein folding prediction models, including using experimental data as constraints and modifying models with binding partners or post-translational modifications. Additionally, a meta-tool for ranking and suggesting composite models is suggested, driving future advancements in this rapidly evolving field.