AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models

Mihaly Varadi, Stephen Anyango, Mandar Deshpande, Sreenath Nair, Cindy Natassia, Galabina Yordanova, David Yuan, Oana Stroe, Gemma Wood, Agata Laydon, Augustin Žídek, Tim Green, Kathryn Tunyasuvunakool, Stig Petersen, John Jumper, Ellen Clancy, Richard Green, Ankur Vora, Mira Lutfi, Michael Figurnov, Andrew Cowie, Nicole Hobbs, Pushmeet Kohli, Gerard Kleywegt, Ewan Birney, Demis Hassabis, Sameer Velankar
DOI: https://doi.org/10.1093/nar/gkab1061
IF: 14.9
2021-11-17
Nucleic Acids Research
Abstract: The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein-structure predictions. Powered by AlphaFold v2.0 of DeepMind, it has enabled an unprecedented expansion of the structural coverage of the known protein-sequence space. AlphaFold DB provides programmatic access to and interactive visualization of predicted atomic coordinates, per-residue and pairwise model-confidence estimates and predicted aligned errors. The initial release of AlphaFold DB contains over 360,000 predicted structures across 21 model-organism proteomes, which will soon be expanded to cover most of the (over 100 million) representative sequences from the UniRef90 data set.
biochemistry & molecular biology
What problem does this paper attempt to address?