Abstract: Protein structure prediction can be used to determine the three-dimensional shape of a protein from its amino acid sequence<a href="/articles/s41586-019-1923-7#ref-CR1">1</a>. This problem is of fundamental importance as the structure of a protein largely determines its function<a href="/articles/s41586-019-1923-7#ref-CR2">2</a>; however, protein structures can be difficult to determine experimentally. Considerable progress has recently been made by leveraging genetic information. It is possible to infer which amino acid residues are in contact by analysing covariation in homologous sequences, which aids in the prediction of protein structures<a href="/articles/s41586-019-1923-7#ref-CR3">3</a>. Here we show that we can train a neural network to make accurate predictions of the distances between pairs of residues, which convey more information about the structure than contact predictions. Using this information, we construct a potential of mean force<a href="/articles/s41586-019-1923-7#ref-CR4">4</a> that can accurately describe the shape of a protein. We find that the resulting potential can be optimized by a simple gradient descent algorithm to generate structures without complex sampling procedures. The resulting system, named AlphaFold, achieves high accuracy, even for sequences with fewer homologous sequences. In the recent Critical Assessment of Protein Structure Prediction<a href="/articles/s41586-019-1923-7#ref-CR5">5</a> (CASP13), a blind assessment of the state of the field, AlphaFold created high-accuracy structures (with template modelling (TM) scores<a href="/articles/s41586-019-1923-7#ref-CR6">6</a> of 0.7 or higher) for 24 out of 43 free modelling domains, whereas the next best method, which used sampling and contact information, achieved such accuracy for only 14 out of 43 domains. AlphaFold represents a considerable advance in protein-structure prediction. We expect this increased accuracy to enable insights into the function and malfunction of proteins, especially in cases for which no structures for homologous proteins have been experimentally determined<a href="/articles/s41586-019-1923-7#ref-CR7">7</a>.
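
The pipeline the abstract outlines (predicted inter-residue distances, a potential built from them, and gradient descent on that potential) can be illustrated with a toy sketch. The code below is not AlphaFold's implementation: it assumes a single target distance per residue pair instead of a learned distance distribution, swaps the potential of mean force for a simple harmonic penalty, and optimizes Cartesian coordinates directly. The names `pairwise_distances`, `distance_potential`, and `descend`, and the synthetic helix-like data, are hypothetical and chosen for illustration only.

```python
import numpy as np


def pairwise_distances(coords):
    """Return the (N, N) matrix of Euclidean distances between N points."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def distance_potential(coords, target, weight):
    """Toy harmonic potential (an assumption, not the paper's potential).

    E = 0.5 * sum_{i<j} weight[i, j] * (d_ij - target[i, j])**2
    Returns the energy and its gradient with respect to coords.
    `target` and `weight` must be symmetric (N, N) arrays.
    """
    diff = coords[:, None, :] - coords[None, :, :]   # (N, N, 3)
    dist = np.linalg.norm(diff, axis=-1)              # (N, N)
    w = weight.astype(float).copy()
    np.fill_diagonal(w, 0.0)                          # ignore self-pairs
    err = dist - target
    energy = 0.25 * np.sum(w * err ** 2)              # each pair appears twice
    safe = np.maximum(dist, 1e-9)                     # guard against division by zero
    grad = np.sum((w * err / safe)[:, :, None] * diff, axis=1)
    return energy, grad


def descend(target, weight, steps=2000, lr=0.01, seed=0):
    """Plain gradient descent from a random start (illustrative settings)."""
    rng = np.random.default_rng(seed)
    coords = rng.normal(size=(target.shape[0], 3))
    for _ in range(steps):
        _, grad = distance_potential(coords, target, weight)
        coords = coords - lr * grad
    return coords


# Synthetic "true" structure: 10 points on a helix-like curve (made-up data).
t = np.arange(10)
true_coords = np.stack(
    [2.3 * np.cos(1.7 * t), 2.3 * np.sin(1.7 * t), 1.5 * t], axis=1
)
target = pairwise_distances(true_coords)   # stands in for predicted distances
weight = np.ones_like(target)              # uniform confidence for every pair

# Same seed and shape as inside descend, so this is the actual starting point.
start = np.random.default_rng(0).normal(size=(10, 3))
start_energy, _ = distance_potential(start, target, weight)
model = descend(target, weight)
final_energy, _ = distance_potential(model, target, weight)
print(f"energy before: {start_energy:.2f}, after: {final_energy:.4f}")
```

Because only distances enter this toy potential, the result is defined only up to rotation, translation, and reflection, so comparing `model` with `true_coords` requires a superposition step or a comparison of distance matrices.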

AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models

AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences

Highly accurate protein structure prediction with AlphaFold

Highly accurate protein structure prediction for the human proteome

Accurate structure prediction of biomolecular interactions with AlphaFold 3

Highly significant improvement of protein sequence alignments with AlphaFold2

Improved protein structure prediction using potentials from deep learning

Protein structure predictions to atomic accuracy with AlphaFold

AlphaFold-latest: revolutionizing protein structure prediction for comprehensive biomolecular insights and therapeutic advancements

High-resolution de novo structure prediction from primary sequence

The power and pitfalls of AlphaFold2 for structure prediction beyond rigid globular proteins

Dissecting AlphaFold's Capabilities with Limited Sequence Information

AlphaFold2 transmembrane protein structure prediction shines

Exploring structural diversity across the protein universe with The Encyclopedia of Domains

Protein 3D Structure Identification by AlphaFold: a Physics-Based Prediction or Recognition Using Huge Databases?

Large protein databases reveal structural complementarity and functional locality

APACE: AlphaFold2 and advanced computing as a service for accelerated discovery in biophysics

High-throughput prediction of protein conformational distributions with subsampled AlphaFold2

SPEACH_AF: Sampling protein ensembles and conformational heterogeneity with Alphafold2