Improving prediction accuracy for protein structure classification by neural network using feature combination

Ken-Li Lin, Chun Yuan Lin, Chuen-Der Huang, Hsiu-Ming Chang, Chiao Yun Yang, Chin-Teng Lin, Chuan Yi Tang, D. Frank Hsu
2005-01-01
Abstract: Classifying protein structures is essential for determining their function in bioinformatics. At present, a reasonably high prediction accuracy has been achieved in classifying proteins into the four classes in SCOP. However, classifying proteins into fine-grained folding categories remains a challenge, especially when the number of possible folding patterns, such as those defined in SCOP, is large. In our previous work, we proposed a hierarchical learning architecture (HLA), two indirect coding features, and a gate function to differentiate proteins according to their classes and folding patterns. Our prediction accuracy for 27 folding categories was 65.5%, which compares favorably with the 56.5% previously reported by Ding and Dubchak. The success of protein structure classification depends on two factors: the computational methods used and the features selected. In this paper, we use a combinatorial fusion analysis technique to facilitate feature selection and combination for improving predictive accuracy in protein structure classification. When combinatorial fusion is applied to our previous work, the resulting classification has an overall prediction accuracy of 87.8% for four classes and 70.9% for 27 folding categories. These rates are significantly higher than those of our previous work and demonstrate that combinatorial fusion is a valuable method for protein structure classification.
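For illustration only: combinatorial fusion is commonly described as combining several scoring systems either by their scores or by their ranks. The sketch below assumes each feature yields one classifier that outputs a score per class for a given protein. The feature names, score values, min-max normalization, and simple averaging are all hypothetical choices made for this example, not the authors' implementation or data.

```python
from typing import Dict, List


def normalize(scores: List[float]) -> List[float]:
    """Min-max scale one scoring system's outputs to [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def ranks(scores: List[float]) -> List[int]:
    """Rank 1 is the highest score; ties are broken by position."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def score_combination(systems: Dict[str, List[float]]) -> List[float]:
    """Average the normalized scores of all systems, per class."""
    normed = [normalize(s) for s in systems.values()]
    return [sum(col) / len(normed) for col in zip(*normed)]


def rank_combination(systems: Dict[str, List[float]]) -> List[float]:
    """Average the ranks of all systems, per class (lower is better)."""
    ranked = [ranks(s) for s in systems.values()]
    return [sum(col) / len(ranked) for col in zip(*ranked)]


# Hypothetical scores for one protein over four classes, produced by
# three classifiers that were each trained on a different feature.
systems = {
    "feature_A": [0.70, 0.10, 0.15, 0.05],
    "feature_B": [0.30, 0.40, 0.20, 0.10],
    "feature_C": [0.55, 0.25, 0.15, 0.05],
}

by_score = score_combination(systems)
by_rank = rank_combination(systems)

# Score combination picks the class with the highest fused score;
# rank combination picks the class with the lowest fused rank.
predicted_by_score = max(range(len(by_score)), key=lambda c: by_score[c])
predicted_by_rank = min(range(len(by_rank)), key=lambda c: by_rank[c])
```

In this toy input, `feature_B` alone would choose the second class, while both fused views choose the first, which is the kind of disagreement that combining features is meant to resolve.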