Protein Secondary Structure Co-Training Prediction Method

LIU Jun, XIONG Zhong-yang, WANG Yin-hui
DOI: https://doi.org/10.3969/j.issn.1001-3695.2011.05.027
2011-01-01
Abstract: Machine learning based protein secondary structure prediction methods suffer from low prediction accuracy because they ignore the hydrophobic properties of amino acids and the interactions between amino acids that are far apart in the sequence. To address this problem, comparative experiments were carried out. A hydrophobic value sequence can be built by replacing each amino acid with its hydrophobic value. The experiments show that a BP neural network trained on long hydrophobic value sequences performs well in predicting the E structure, which is controlled mainly by long-range interactions between amino acids. Because the Profile space and the hydrophobic energy value space are both sufficient and redundant views, this paper proposes a Co-training algorithm. The proposed algorithm uses two classifiers: an SVM classifier trained in the Profile space and a BP neural network classifier trained in the hydrophobic value space. Each classifier predicts the secondary structure of an amino acid independently. If the two classifiers disagree on an amino acid, a proposed arbitration rule, based on an active selecting strategy, makes the final decision. Suspected samples and creditable samples are defined according to the characteristics of the classifiers and the spaces, and are used to arbitrate the conflicting predictions. The experimental results show that the proposed algorithm achieves higher prediction accuracy than existing algorithms, both for the E structure, which is controlled mainly by long-range interactions, and for the H structure, which is controlled mainly by short-range interactions.
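The two steps described in the abstract, building a hydrophobic value sequence and arbitrating between two disagreeing classifiers, can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not name a hydrophobicity scale, so the Kyte-Doolittle hydropathy values are used as a stand-in; the window width, the zero padding, the confidence scores, and the 0.6 threshold are hypothetical; and the `arbitrate` function is only a simple confidence-based placeholder, since the abstract does not give the actual definitions of suspected and creditable samples.

```python
# Assumption: Kyte-Doolittle hydropathy scale. The abstract does not say
# which hydrophobicity scale the paper uses.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_sequence(seq):
    """Replace each amino acid letter with its hydrophobic value."""
    return [KYTE_DOOLITTLE[aa] for aa in seq.upper()]


def windows(values, half_width, pad=0.0):
    """Return one fixed-length window per residue, centred on that residue.

    A larger half_width lets a classifier see residues further away.
    Padding the ends with `pad` is an illustrative choice.
    """
    padded = [pad] * half_width + list(values) + [pad] * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(values))]


def arbitrate(svm_label, svm_conf, bp_label, bp_conf, threshold=0.6):
    """Hypothetical arbitration rule for one residue.

    Assumption: a prediction counts as "creditable" when its confidence
    reaches `threshold`, and as "suspected" otherwise. This is NOT the
    rule from the paper, whose definitions are not given in the abstract.
    """
    if svm_label == bp_label:
        return svm_label
    svm_creditable = svm_conf >= threshold
    bp_creditable = bp_conf >= threshold
    if svm_creditable and not bp_creditable:
        return svm_label
    if bp_creditable and not svm_creditable:
        return bp_label
    # Both creditable or both suspected: fall back to the higher confidence.
    return svm_label if svm_conf >= bp_conf else bp_label


if __name__ == "__main__":
    values = hydrophobic_sequence("MKVLA")
    for window in windows(values, half_width=2):
        print(window)
    print(arbitrate("H", 0.9, "E", 0.4))
```

In this sketch the windows would serve as input vectors for the BP neural network, while the SVM would receive profile features for the same residue; `arbitrate` is called only after both classifiers have produced a label and a confidence for that residue.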