Taxonomy-specific Assessment of Intrinsic Disorder Predictions at Residue and Region Levels in Higher Eukaryotes, Protists, Archaea, Bacteria and Viruses

Sushmita Basu, Lukasz Kurgan
DOI: https://doi.org/10.1016/j.csbj.2024.04.059
IF: 6.155
2024-04-30
Computational and Structural Biotechnology Journal
Abstract: Intrinsic disorder predictors have been evaluated in several studies, including the two large CAID experiments. However, these studies are biased towards eukaryotic proteins and focus primarily on residue-level predictions. We provide a first-of-its-kind assessment that comprehensively covers the taxonomy and evaluates predictions at both the residue and disordered region levels. We curate a benchmark dataset that uniformly covers eukaryotic, archaeal, bacterial, and viral proteins. We find that predictive performance differs substantially across taxonomic groups: viruses are predicted most accurately, followed by protists and higher eukaryotes, while bacterial and archaeal proteins are predicted less accurately. These trends are consistent across predictors. We also find that current tools, except for flDPnn, struggle to reproduce the native distributions of the numbers and sizes of disordered regions. Moreover, analysis of two variants of disorder predictions derived from AlphaFold2-predicted structures reveals that they produce accurate residue-level propensities for archaea, bacteria and protists. However, they underperform for higher eukaryotes and generally struggle to accurately identify disordered regions. Our results motivate the development of new predictors that target bacteria and archaea and that produce accurate results at both the residue and region levels. We also stress the need to include region-level assessments in future evaluations.
biochemistry & molecular biology