UbiSites-SRF: Ubiquitination Sites Prediction Using Statistical Moment with Random Forest Approach

Shazia Murad, Arwa Mashat, Alia Mahfooz, Sher Afzal Khan, Omar Barukab
DOI: https://doi.org/10.21203/rs.3.rs-669582/v1
2021-07-09
Abstract: Ubiquitination is a process that supports the growth and development of eukaryotic and prokaryotic organisms. It helps regulate numerous functions, including the cell division cycle, caspase-mediated cell death, maintenance of protein transcription, signal transduction, and repair of DNA damage. Because of these roles, identifying ubiquitination sites is essential to understanding the underlying molecular mechanism. Traditional methods such as mass spectrometry and site-directed mutagenesis are used for this purpose, but they are tedious and time-consuming. To overcome these limitations, interest in computational models for identifying such sites is growing. In this study, an accurate and efficient classification model for identifying ubiquitination sites was constructed. The proposed model uses statistical moments for feature extraction and a random forest for classification. Three ubiquitination datasets are used to train and test the model, which is assessed through 10-fold cross-validation and jackknife tests. With 10-fold cross-validation, we achieved an accuracy of 100% for dataset-1, 99.88% for dataset-2, and 99.84% for dataset-3; with the jackknife test, we obtained 100% for dataset-1, 99.91% for dataset-2, and 99.99% for dataset-3. These results are close to the maximum attainable and exceed those of the existing models available in the literature.
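The abstract does not say which moments or which residue encoding the model uses, so the following is only a minimal sketch of statistical-moment feature extraction under stated assumptions. It assumes each residue is encoded by its alphabetical position among the 20 standard amino acids and computes raw and central moments up to order 3. The names `encode`, `raw_moment`, `central_moment`, and `moment_features` are hypothetical and are not taken from the paper.

```python
from typing import List

# The 20 standard amino acids as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: encode each residue by its alphabetical position (1..20).
# The paper may use a different numeric encoding.
RESIDUE_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode(peptide: str) -> List[int]:
    """Map each residue to its integer index; unknown symbols map to 0."""
    return [RESIDUE_INDEX.get(aa, 0) for aa in peptide.upper()]


def raw_moment(values: List[int], order: int) -> float:
    """k-th raw moment: the mean of x**k over the sequence."""
    return sum(v ** order for v in values) / len(values)


def central_moment(values: List[int], order: int) -> float:
    """k-th central moment: the mean of (x - mean)**k over the sequence."""
    mean = raw_moment(values, 1)
    return sum((v - mean) ** order for v in values) / len(values)


def moment_features(peptide: str, max_order: int = 3) -> List[float]:
    """Return raw moments of order 1..max_order, then central moments of order 2..max_order."""
    values = encode(peptide)
    if not values:
        raise ValueError("peptide must contain at least one residue")
    raw = [raw_moment(values, k) for k in range(1, max_order + 1)]
    central = [central_moment(values, k) for k in range(2, max_order + 1)]
    return raw + central


if __name__ == "__main__":
    # Hypothetical peptide window centred on a lysine (K) residue.
    print(moment_features("AGLKVTE"))
```

A fixed-length vector of this kind, computed for each candidate peptide window, could then be passed to a random forest classifier (for example, scikit-learn's `RandomForestClassifier`) and evaluated with 10-fold cross-validation or a jackknife (leave-one-out) test.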