PDB_Amyloid: The Extended Live Amyloid Structure List from the PDB

Kristof Takacs, Balint Varga, Vince Grolmusz
DOI: https://doi.org/10.48550/arXiv.1805.09758
IF: 6.064
2018-05-24
Biomolecules
Abstract: The Protein Data Bank (PDB) contains more than 135 000 entries today. Relatively few of these can be identified as amyloid structures, since amyloids are insoluble in water. Therefore, most amyloid structures deposited in the PDB were recorded by solid-state NMR. Based on a geometric analysis of the deposited structures, we have prepared an automatically updated webserver that generates the list of deposited amyloid structures and, additionally, of those globular protein entries that have amyloid-like substructures of a given size and characteristics. We have found that, by applying only properly chosen geometric conditions, it is possible to identify the deposited amyloid structures as well as a number of globular proteins with amyloid-like substructures. We have analyzed these globular proteins and found that many of them are known to form amyloids more easily than many other globular proteins. Our results relate to the method of (Stankovic, I. et al. (2017): Construction of Amyloid PDB Files Database. Transactions on Internet Research. 13 (1): 47-51), who applied a hybrid textual-search and geometric approach for finding amyloids in the PDB. If one intends to identify a subset of the PDB for some application, the identification algorithm needs to be re-run periodically, since in 2017 an average of 30 new entries were deposited in the data bank every day. Our webserver is updated regularly and automatically, and the identified amyloid and partial amyloid structures can be viewed, or their list downloaded, at https://pitgroup.org/amyloid.