Abstract: Three-dimensional (3D) protein structures reveal fundamental information about protein function. State-of-the-art protein structure prediction methods such as AlphaFold are widely used to predict the structures of uncharacterized proteins in biomedical research. There is a significant need to further improve the quality and nativeness of the predicted structures to enhance their usability. Current machine learning methods for refining protein structures focus mostly on improving the backbone quality of predicted structures, without effectively leveraging and enhancing the conformation of all atoms, including sidechains, while molecular simulation methods are computationally intensive and time-consuming. In this work, we develop ATOMRefine, a deep learning-based, end-to-end, all-atom protein structural model refinement method. It represents a predicted tertiary structure as a molecular graph, using a novel graph representation of all atoms, and applies an SE(3)-equivariant graph transformer network, which is equivariant to the rotation and translation of 3D structures, to directly refine the coordinates of all atoms. The method is first trained and tested on structural models in AlphaFoldDB whose experimental structures are known, and then blindly tested on 69 CASP14 regular targets and 7 CASP14 refinement targets. ATOMRefine improves both the backbone quality and the all-atom conformation of the initial structural models generated by AlphaFold. It also performs better than state-of-the-art refinement methods on multiple evaluation metrics, including the MolProbity score, an all-atom model quality score based on the analysis of all-atom contacts, bond lengths, atom clashes, torsion angles, and sidechain rotamers. Because ATOMRefine can refine a protein structure quickly, it provides a viable, fast solution for improving protein geometry and fixing structural errors in predicted structures through direct coordinate refinement.
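
The equivariance property mentioned in the abstract can be illustrated with a small numerical check. The sketch below is not the ATOMRefine network: it is a toy, EGNN-style coordinate update written for illustration only, and every name in it (`build_radius_graph`, `equivariant_update`, `random_rotation`, `check_equivariance`, the 5.0 cutoff, the step size) is an assumption rather than something taken from the paper. It only shows what "equivariant to rotation and translation" means for a coordinate refinement step: rotating and translating the input and then refining gives the same result as refining first and then applying the same rotation and translation.

```python
import numpy as np


def build_radius_graph(coords, cutoff=5.0):
    """Directed edges (i, j) between distinct atoms closer than `cutoff`.

    Hypothetical stand-in for an all-atom molecular graph; the cutoff
    value is an illustrative assumption.
    """
    n = coords.shape[0]
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    mask = (dist < cutoff) & ~np.eye(n, dtype=bool)  # drop self-edges
    src, dst = np.nonzero(mask)
    return src, dst


def equivariant_update(coords, src, dst, step=0.1):
    """Toy refinement step: shift each atom along the vectors to its
    neighbours, weighted by a function of the distance only."""
    rel = coords[src] - coords[dst]                    # rotates with the input
    dist = np.linalg.norm(rel, axis=-1, keepdims=True)
    weight = np.exp(-dist)                             # invariant scalar weight
    delta = np.zeros_like(coords)
    np.add.at(delta, src, weight * rel)                # sum messages per atom
    return coords + step * delta


def random_rotation(rng):
    """Random proper rotation matrix (determinant +1) via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


def check_equivariance(seed=0, n_atoms=20):
    """Largest deviation between refine(R x + t) and R refine(x) + t."""
    rng = np.random.default_rng(seed)
    coords = rng.normal(scale=3.0, size=(n_atoms, 3))
    rot = random_rotation(rng)
    shift = rng.normal(size=3)

    # Transform first, then refine.
    moved = coords @ rot.T + shift
    out_a = equivariant_update(moved, *build_radius_graph(moved))

    # Refine first, then transform.
    out_b = equivariant_update(coords, *build_radius_graph(coords)) @ rot.T + shift

    return float(np.max(np.abs(out_a - out_b)))


if __name__ == "__main__":
    print("max deviation:", check_equivariance())
```

The check passes because the update uses only relative position vectors and distance-dependent weights: distances are unchanged by rotation and translation, and relative vectors rotate together with the input. A learned SE(3)-equivariant network follows the same principle with trainable, far more expressive operations.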

Refinement of AlphaFold-Multimer structures with single sequence input

Evaluation of AlphaFold-Multimer prediction on multi-chain protein complexes

Improved multimer prediction using massive sampling with AlphaFold in CASP15

Improved protein complex prediction with AlphaFold-multimer by denoising the MSA profile

AFsample: improving multimer prediction with AlphaFold using massive sampling

Improved protein structure prediction with trRosettaX2, AlphaFold2, and optimized MSAs in CASP15

Cutaneous alternariosis in a cardiac transplant recipient

AlphaFold-Multimer accurately captures interactions and dynamics of intrinsically disordered protein regions

AFsample: Improving Multimer Prediction with AlphaFold using Aggressive Sampling

Improving AlphaFold2-based protein tertiary structure prediction with MULTICOM in CASP15

Benchmarking Refined and Unrefined AlphaFold2 Structures for Hit Discovery

Unmasking AlphaFold to integrate experiments and predictions in multimeric complexes

Integrating deep learning, threading alignments, and a multi‐MSA strategy for high‐quality protein monomer and complex structure prediction in CASP15

Improved the heterodimer protein complex prediction with protein language models

Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants

Unmasking AlphaFold: integration of experiments and predictions in multimeric complexes

Accurate structure prediction of biomolecular interactions with AlphaFold 3

MHC-Fine: Fine-tuned AlphaFold for precise MHC-peptide complex prediction

Towards a greener AlphaFold2 protocol for protein complex modeling: Insights from CAPRI Round 55

Enhancing cryo-EM structure prediction with DeepTracer and AlphaFold2 integration

Atomic protein structure refinement using all-atom graph representations and SE(3)-equivariant graph neural networks