Abstract: The exponential growth in the number of available protein sequences is unmatched by the slower growth in the number of known structures. As a result, fast and efficient protein secondary structure prediction methods are essential for a broad understanding of protein structures. Computational methods that efficiently determine secondary structure can in turn facilitate protein tertiary structure prediction, since most tertiary structure methods start from secondary structure predictions. Recently, we developed a fast learning optimized prediction methodology (FLOPRED) for predicting protein secondary structure (Saraswathi et al. in JMM 18:4275, 2012). Data are generated by using knowledge-based potentials combined with structure information from the CATH database. A neural network-based extreme learning machine (ELM) and advanced particle swarm optimization (PSO) are used with these data to obtain better and faster convergence to more accurate secondary structure predictions. A five-fold cross-validated testing accuracy of 83.8% and a segment overlap (SOV) score of 78.3% are obtained in this study. Secondary structure predictions and their accuracy are usually presented for three secondary structure elements (α-helix, β-strand and coil), but the results have rarely been analyzed with respect to their constituent amino acids. In this paper, we use the results obtained with FLOPRED to describe in detail how different amino acid types behave in secondary structure prediction. We investigate the influence of the composition, physico-chemical properties and position-specific occurrence preferences of amino acids within secondary structure elements. In addition, we identify the correlation between these properties and prediction accuracy. These detailed results suggest several ways in which secondary structure predictions can be improved, which might in turn lead to improved protein design and engineering.
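The per-amino-acid analysis described in the abstract can be illustrated with a minimal sketch. This is not the FLOPRED code: the function name `per_residue_accuracy`, the three-letter state alphabet (`H`, `E`, `C`) and the toy strings are assumptions made for illustration only. The sketch computes the overall three-state accuracy and then breaks the same counts down by amino acid type.

```python
from collections import defaultdict


def per_residue_accuracy(sequence, observed, predicted):
    """Return (overall accuracy, {amino acid: accuracy}) for aligned strings.

    sequence  : one-letter amino acid codes
    observed  : reference secondary structure states (assumed H/E/C)
    predicted : predicted secondary structure states (assumed H/E/C)
    """
    if not (len(sequence) == len(observed) == len(predicted)):
        raise ValueError("all three strings must have the same length")

    correct = defaultdict(int)  # correctly predicted residues per amino acid
    total = defaultdict(int)    # all residues per amino acid
    for aa, obs, pred in zip(sequence, observed, predicted):
        total[aa] += 1
        if obs == pred:
            correct[aa] += 1

    n = len(sequence)
    overall = sum(correct.values()) / n if n else 0.0
    by_amino_acid = {aa: correct[aa] / total[aa] for aa in total}
    return overall, by_amino_acid


# Toy, made-up input (not data from the study).
seq = "AGAGLK"
obs = "HHCCEE"
pred = "HHCHEC"
q3, by_aa = per_residue_accuracy(seq, obs, pred)
for aa in sorted(by_aa):
    print(aa, round(by_aa[aa], 2))
print("overall", round(q3, 2))
```

Grouping the counts by amino acid rather than by secondary structure element is the only change relative to a conventional accuracy calculation, which is why the breakdown can be produced from existing prediction output without retraining.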
