SPRINT-Gly: predicting N- and O-linked glycosylation sites of human and mouse proteins by using sequence and predicted structural properties

Ghazaleh Taherzadeh,Abdollah Dehzangi,Maryam Golchin,Yaoqi Zhou,Matthew P Campbell
DOI: https://doi.org/10.1093/bioinformatics/btz215
IF: 5.8
2019-03-23
Bioinformatics
Abstract:Abstract Motivation Protein glycosylation is one of the most abundant post-translational modifications that plays an important role in immune responses, intercellular signaling, inflammation and host-pathogen interactions. However, due to the poor ionization efficiency and microheterogeneity of glycopeptides identifying glycosylation sites is a challenging task, and there is a demand for computational methods. Here, we constructed the largest dataset of human and mouse glycosylation sites to train deep learning neural networks and support vector machine classifiers to predict N-/O-linked glycosylation sites, respectively. Results The method, called SPRINT-Gly, achieved consistent results between ten-fold cross validation and independent test for predicting human and mouse glycosylation sites. For N-glycosylation, a mouse-trained model performs equally well in human glycoproteins and vice versa, however, due to significant differences in O-linked sites separate models were generated. Overall, SPRINT-Gly is 18% and 50% higher in Matthews correlation coefficient than the next best method compared in N-linked and O-linked sites, respectively. This improved performance is due to the inclusion of novel structure and sequence-based features. Availability and implementation http://sparks-lab.org/server/SPRINT-Gly/ Supplementary information Supplementary data are available at Bioinformatics online.
biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
What problem does this paper attempt to address?