Predicting Human Disease-Associated piRNAs Based on Multi-source Information and Random Forest.

Kai Zheng, Zhu-Hong You, Lei Wang, Hao-Yuan Li, Bo-Ya Ji
DOI: https://doi.org/10.1007/978-3-030-60802-6_20
2020-01-01
Abstract: Whole-genome analysis studies have shown that Piwi-interacting RNAs (piRNAs) play a crucial role in disease progression and diagnosis and as therapeutic targets. However, traditional biological experiments are expensive and time-consuming. Thus, computational models could serve as a complementary means of providing potential disease-related piRNA candidates. In this study, we propose a novel computational model called APDA to identify piRNA-disease associations. The proposed method integrates disease semantic similarity and piRNA sequence information to construct feature vectors, then maps them to the optimal feature subspace through a stacked autoencoder to obtain the final feature vector. Finally, a random forest classifier is used to infer disease-related piRNAs. In five-fold cross-validation, APDA achieved an average AUC of 0.9088 with a standard deviation of 0.0126, which is significantly better than the compared method. Therefore, the proposed APDA method is a powerful and necessary tool for predicting human disease-associated piRNAs and provides new impetus to reveal the underlying causes of human disease.
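The evaluation protocol named in the abstract (five-fold cross-validation summarized as a mean AUC and a standard deviation) can be illustrated with a minimal sketch. This is not the authors' code: the helper names `auc`, `k_fold_indices`, and `cross_validated_auc`, the `score_fn` hook standing in for the trained classifier, and the choice of the population standard deviation are all assumptions made for illustration.

```python
import random
import statistics


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=5, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_auc(labels, score_fn, k=5, seed=0):
    """Return (mean AUC, std of AUC, per-fold AUCs).

    score_fn(train_idx, test_idx) is a hypothetical hook: it should fit any
    classifier on train_idx and return one score per index in test_idx.
    """
    folds = k_fold_indices(len(labels), k, seed)
    aucs = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores = score_fn(train_idx, test_idx)
        aucs.append(auc([labels[j] for j in test_idx], scores))
    # Assumption: population standard deviation; the abstract does not say
    # which estimator was used.
    return statistics.mean(aucs), statistics.pstdev(aucs), aucs
```

In practice, `score_fn` would wrap the feature construction, the autoencoder projection, and the random forest, returning the predicted association probability for each held-out piRNA-disease pair.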