LMTRDA: Using logistic model tree to predict MiRNA-disease associations by fusing multi-source information of sequences and similarities.
Lei Wang, Zhu-Hong You, Xing Chen, Yang-Ming Li, Ya-Nan Dong, Li-Ping Li, Kai Zheng
DOI: https://doi.org/10.1371/journal.pcbi.1006865
2019-01-01
PLoS Computational Biology
Abstract: Emerging evidence has shown that microRNAs (miRNAs) play an important role in human disease. Identifying potential associations between miRNAs and diseases is significant for the study of pathology, diagnosis and therapy. However, only a tiny portion of all miRNA-disease pairs in the current datasets are experimentally validated. This prompts the development of high-precision computational methods to predict real interaction pairs. In this paper, we propose a new model, Logistic Model Tree for predicting miRNA-Disease Association (LMTRDA), which fuses multi-source information including miRNA sequences, miRNA functional similarity, disease semantic similarity, and known miRNA-disease associations. In particular, we introduce miRNA sequence information and extract its features using a natural language processing technique, for the first time in a miRNA-disease prediction model. In the cross-validation experiment, LMTRDA obtained 90.51% prediction accuracy with 92.55% sensitivity and an AUC of 90.54% on the HMDD V3.0 dataset. To further evaluate the performance of LMTRDA, we compared it with different classifier and feature descriptor models. In addition, we validated the predictive ability of LMTRDA on human diseases including Breast Neoplasms, Colon Neoplasms and Lymphoma. As a result, 28, 27 and 26 of the top 30 miRNAs predicted for these diseases, respectively, were verified by experiments in different kinds of case studies. These experimental results demonstrate that LMTRDA is a reliable model for predicting associations between miRNAs and diseases.

Author summary: Identification of miRNA-disease associations is considered an important step in the development of diagnosis and therapy. Computational methods contribute to discovering potential disease-related miRNAs.
Based on the assumption that functionally related miRNAs tend to be involved in similar diseases, the LMTRDA model is proposed to prioritize the underlying miRNA-disease associations by fusing multi-source information including miRNA sequences, miRNA functional similarity, disease semantic similarity, and known miRNA-disease associations. In cross-validation, the promising results demonstrated the effectiveness of the proposed model. We further carried out case studies of three important complex human diseases, Breast Neoplasms, Colon Neoplasms and Lymphoma; 28, 27 and 26 of the top 30 predicted miRNA-disease associations, respectively, were manually confirmed based on recent experimental reports. It is anticipated that the LMTRDA model could prioritize the most promising miRNA-disease associations on a large scale to advance biological experimental validation in the future, which could further contribute to the understanding of complex disease mechanisms.
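
To make the idea of fusing sequence and similarity information concrete, the following is a minimal Python sketch of how one miRNA-disease pair might be turned into a single feature vector. It is not the authors' implementation. The abstract does not say which natural language processing technique is used, so treating overlapping 3-mers as "words" is an assumption, and the function names, the toy sequence and the toy similarity values are all hypothetical. The classifier itself (a logistic model tree) is not shown; the fused vector would simply be its input.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA nucleotides


def kmer_vocabulary(k=3):
    """All possible k-mers over the RNA alphabet, in a fixed order."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]


def kmer_frequencies(sequence, k=3):
    """Treat overlapping k-mers as 'words' and return their relative frequencies.

    Assumption: a bag-of-k-mers view stands in for the (unspecified)
    NLP technique mentioned in the abstract.
    """
    words = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(words)
    total = max(len(words), 1)  # avoid division by zero for short sequences
    return [counts[w] / total for w in kmer_vocabulary(k)]


def fuse_pair_descriptor(sequence, mirna_similarity_row, disease_similarity_row, k=3):
    """Concatenate sequence features with the two similarity profiles."""
    return (
        kmer_frequencies(sequence, k)
        + list(mirna_similarity_row)
        + list(disease_similarity_row)
    )


# Toy inputs (hypothetical values, for illustration only).
toy_sequence = "UGAGGUAGUAGGUUGUAUAGUU"
toy_mirna_similarity = [1.0, 0.4, 0.1]   # similarity of this miRNA to 3 miRNAs
toy_disease_similarity = [1.0, 0.2]      # similarity of this disease to 2 diseases

descriptor = fuse_pair_descriptor(
    toy_sequence, toy_mirna_similarity, toy_disease_similarity
)
```

With k = 3 the sequence part contributes 64 values, so the toy descriptor has 64 + 3 + 2 entries. In a real setting the similarity rows would come from the full miRNA functional similarity and disease semantic similarity matrices, and known associations would supply the positive training labels.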