Abstract: Prediction of transmembrane (TM) proteins from their sequence facilitates the functional study of genomes and the search for potential membrane-associated therapeutic targets. Computational methods for predicting TM sequences have been developed. These methods achieve high prediction accuracy for many TM proteins, but some of them are less effective for specific classes of TM proteins. Moreover, their performance has been tested on relatively small sets of TM and non-membrane (NM) proteins. It is therefore useful to evaluate TM protein prediction methods on a more diverse set of proteins and to test their performance on specific classes of TM proteins. This work extensively evaluated the capability of support vector machine (SVM) classification systems to predict TM proteins and proteins of several TM classes. These SVM systems were trained and tested by using 14962 TM and 12168 NM proteins from Pfam protein families. An independent set of 3389 TM and 6063 NM proteins from curated Pfam families was used to further evaluate the performance of these SVM systems. 90.1% of TM proteins and 86.7% of NM proteins were correctly predicted, accuracies comparable to those reported in other studies. The prediction accuracies for proteins of specific TM classes are 95.6%, 90.0%, 92.7%, and 73.9% for G-protein coupled receptors, envelope proteins, outer membrane proteins, and transporters/channels respectively; and 98.1%, 99.5%, 86.4%, and 98.6% for non-G-protein coupled receptors, non-envelope proteins, non-outer membrane proteins, and non-transporters/non-channels respectively. Tested on a significantly larger and more diverse set of proteins than in previous studies, SVM systems appear capable of predicting TM proteins and proteins of specific TM classes at accuracies comparable to those from previous studies. Our SVM system, SVMProt, can be accessed at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.
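
The abstract reports accuracy separately for TM and NM proteins but gives no single combined figure. The short Python sketch below shows how the two per-class rates could be combined. It assumes that the 90.1% and 86.7% figures refer to the independent set of 3389 TM and 6063 NM proteins, which the abstract suggests but does not state outright. The function name `overall_accuracy` is illustrative and is not part of SVMProt.

```python
def overall_accuracy(n_pos, n_neg, acc_pos, acc_neg):
    """Overall accuracy implied by per-class accuracies and class sizes."""
    correct = acc_pos * n_pos + acc_neg * n_neg
    return correct / (n_pos + n_neg)


# Set sizes and per-class accuracies as quoted in the abstract.
# Assumption: both rates were measured on this independent set.
n_tm, n_nm = 3389, 6063
acc_tm, acc_nm = 0.901, 0.867

# Size-weighted accuracy over all proteins in the set.
overall = overall_accuracy(n_tm, n_nm, acc_tm, acc_nm)

# Unweighted mean of the two per-class accuracies (balanced accuracy).
balanced = (acc_tm + acc_nm) / 2

print(f"overall accuracy:  {overall:.3f}")
print(f"balanced accuracy: {balanced:.3f}")
```

Because the NM class is nearly twice as large as the TM class, the size-weighted figure sits closer to the NM accuracy, while the balanced figure weights both classes equally.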

Prediction of Transmembrane Proteins from Their Primary Sequence by Support Vector Machine Approach