Protein Folding Shape Code Prediction Based On Psi-Blast Profile Using Two-Stage Neural Network
Chong Yu, Jiaan Yang, Juexin Wang, Wei Du, Yan Wang, Yanchun Liang
DOI: https://doi.org/10.1007/978-3-642-31968-6_68
2012-01-01
Abstract: Protein Folding Shape Code (PFSC) is a symbolic definition of protein structure that describes structural detail at a level between secondary structure and tertiary structure. In this article, we build a two-stage neural network model based on the PSI-BLAST profile to predict the Protein Folding Shape Code. First, we use PSI-BLAST to generate position-specific scoring matrices, and then use a sliding window to encode the PSI-BLAST profile information, which is the input to the whole model. The output is the known PFSC, represented by 27 orthogonal vectors. A set of 128 unique protein folds was selected for training and testing, with no similar folds appearing in both the training and testing sets. The folds were chosen by structural similarity criteria rather than sequence similarity criteria. Evaluated by three-fold cross-validation, our model reaches an accuracy of about 65% when the top 3 predicted PFSCs are considered. Although this accuracy is not yet high enough for practical applications, the PFSC method could provide a breakthrough in tertiary structure prediction.
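The encoding and evaluation steps named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the window half-width, the zero padding at the sequence ends, and the helper names `window_features`, `one_hot`, and `top_k_accuracy` are all hypothetical. Only the 27 output classes and the top-3 criterion come from the abstract.

```python
# Illustrative sketch only. Window size, zero padding, and all function
# names are assumptions; the abstract does not specify them.

NUM_PFSC = 27  # the abstract states the PFSC output uses 27 orthogonal vectors


def window_features(pssm, center, half_width):
    """Flatten the PSSM rows in a window centered on `center`.

    `pssm` is a list of rows, one per residue (typically 20 scores per row).
    Positions outside the sequence are zero-padded (an assumption).
    """
    n_cols = len(pssm[0])
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(pssm):
            features.extend(pssm[pos])
        else:
            features.extend([0.0] * n_cols)
    return features


def one_hot(class_index, num_classes=NUM_PFSC):
    """Encode a class index as an orthogonal (one-hot) target vector."""
    vec = [0] * num_classes
    vec[class_index] = 1
    return vec


def top_k_accuracy(scores, labels, k=3):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)
```

With a window of half-width w over a 20-column profile, each residue yields an input vector of length 20 × (2w + 1), and a prediction counts as correct under the top-3 criterion if the true PFSC is among the three highest-scoring outputs.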