Novel Human Mirna-Disease Association Inference Based on Random Forest
Xing Chen,Chun-Chun Wang,Jun Yin,Zhu-Hong You
DOI: https://doi.org/10.1016/j.omtn.2018.10.005
IF: 10.183
2018-01-01
Molecular Therapy — Nucleic Acids
Abstract:Since the first microRNA (miRNA) was discovered, a lot of studies have confirmed the associations between miRNAs and human complex diseases. Besides, obtaining and taking advantage of association information between miRNAs and diseases play an increasingly important role in improving the treatment level for complex diseases. However, due to the high cost of traditional experimental methods, many researchers have proposed different computational methods to predict potential associations between miRNAs and diseases. In this work, we developed a computational model of Random Forest for miRNA-disease association (RFMDA) prediction based on machine learning. The training sample set for RFMDA was constructed according to the human microRNA disease database (HMDD) version (v.)2.0, and the feature vectors to represent miRNA-disease samples were defined by integrating miRNA functional similarity, disease semantic similarity, and Gaussian interaction profile kernel similarity. The Random Forest algorithm was first employed to infer miRNA-disease associations. In addition, a filter-based method was implemented to select robust features from the miRNA-disease feature set, which could efficiently distinguish related miRNA-disease pairs from unrelated miRNA-disease pairs. RFMDA achieved areas under the curve (AUCs) of 0.8891, 0.8323, and 0.8818 +/- 0.0014 under global leave-one-out cross-validation, local leave-one-out cross-validation, and 5-fold cross-validation, respectively, which were higher than many previous computational models. To further evaluate the accuracy of RFMDA, we carried out three types of case studies for four human complex diseases. As a result, 43 (esophageal neoplasms), 46 (lymphoma), 47 (lung neoplasms), and 48 (breast neoplasms) of the top 50 predicted disease-related miRNAs were verified by experiments in different kinds of case studies. The results of cross-validation and case studies indicated that RFMDA is a reliable model for predicting miRNA-disease associations.