OOgenesis_Pred: A sequence-based method for predicting oogenesis proteins by six different modes of Chou's pseudo amino acid composition
Maryam Rahimi, Mohammad Reza Bakhtiarizadeh, Abdollah Mohammadi-Sangcheshmeh
DOI: https://doi.org/10.1016/j.jtbi.2016.11.028
IF: 2.405
2017-02-01
Journal of Theoretical Biology
Abstract: Given the critical role of oogenesis in the formation of ova, or unfertilized eggs, from oogonia by mitotic division and subsequent differentiation, the identification of oogenesis-related proteins is of great interest. However, the experimental determination of proteins involved in oogenesis is expensive, time-consuming, and labor-intensive. A powerful discriminating model is therefore needed to classify oogenesis-related and non-oogenesis-related proteins with high accuracy and precision. Here, for the first time, we developed a support vector machine-based prediction method that differentiates oogenesis proteins from non-oogenesis proteins. By combining informative protein physicochemical properties with a parameter optimization scheme, our method yields robust and consistent performance. In five-fold cross-validation, the model achieved prediction accuracies of 87.68% and 84.82% on the datasets with 90% and 50% identity, respectively. On the independent dataset, it yielded prediction accuracies of 91.62% and 85.38% for the datasets with 90% and 50% identity, respectively, which further demonstrates the effectiveness of our method. Moreover, by applying 10 different feature weighting methods, we determined the more important protein features for discriminating oogenesis-related from non-oogenesis-related proteins, including serine and glycine frequency, quasi-sequence-order, pseudo-amino acid composition, distribution, and conjoint triad. The success rates indicate that our model is a promising and strong model for predicting proteins involved in oogenesis. To enhance the practical value of the proposed method, we developed standalone software, called OOgenesis_Pred, for predicting oogenesis candidate proteins. This software is the first predictor established for identifying oogenesis proteins. We also demonstrated the capability of OOgenesis_Pred by predicting oogenesis-related proteins among several oogenesis candidate proteins. It is anticipated that OOgenesis_Pred will become a powerful tool for future proteomic studies related to oogenesis.
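The abstract names serine and glycine frequency among the more informative features. As a rough illustration of what such a sequence-derived feature looks like, the following minimal sketch computes plain amino acid composition in Python. It is not the OOgenesis_Pred implementation: the function name, the toy sequence, and the choice to ignore non-standard residues are assumptions made for this example. The other descriptors mentioned in the abstract (pseudo-amino acid composition, quasi-sequence-order, conjoint triad) are not covered.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in a protein sequence.

    Assumption for this sketch: characters outside the 20 standard
    residues (e.g. 'X' or gaps) are ignored rather than rejected.
    """
    seq = sequence.upper()
    counts = Counter(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical toy sequence, not taken from the paper's datasets.
toy_sequence = "MSGSSGLKSG"
composition = amino_acid_composition(toy_sequence)

# Serine and glycine frequencies, two of the features named in the abstract.
print(composition["S"], composition["G"])  # prints 0.4 0.3
```

A 20-dimensional vector of this kind could serve as one part of the input to a classifier such as a support vector machine, alongside the richer descriptors the abstract lists.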
biology,mathematical & computational biology