Predicting circRNA-Disease Associations Using Similarity Assessing Graph Convolution from Multi-Source Information Networks.
Yang Li, Xue-Gang Hu, Pei-Pei Li, Lei Wang, Zhu-Hong You
DOI: https://doi.org/10.1109/bibm55620.2022.9995674
2022-01-01
Abstract: Circular RNA (circRNA), a novel endogenous noncoding RNA molecule with a closed-loop structure, can be used as a biomarker for many complex human diseases. Determining the relationships between circRNAs and diseases helps us understand the pathogenesis, diagnosis, and treatment of complex diseases, and therefore plays a critical role in clinical research. However, discovering new circRNA-disease associations by wet-lab methods is not only time-consuming and costly but also random and blind, and it is limited to small-scale studies. There is thus an urgent need for efficient and reliable computational methods that infer potential circRNA-disease associations on a large scale, reducing cost and time while avoiding high false-positive rates. In this paper, we propose a novel computational method for predicting circRNA-disease associations based on the Similarity Assessing Graph Convolution Network (SAGCN) algorithm, which operates on multi-source similarity networks constructed for circRNAs and diseases. First, we fuse the multi-source similarity information of circRNAs and of diseases and construct a multi-source similarity network for each. Then, we use the SAGCN algorithm to extract hidden feature representations of circRNAs and diseases efficiently and objectively by measuring the similarity between different nodes in the network. Finally, the resulting high-level features of circRNAs and diseases are fed to a multilayer perceptron (MLP) classifier for prediction. Under 5-fold cross-validation on the benchmark circR2Disease dataset, the AUC scores of the four SAGCN algorithms are 93.30%, 92.98%, 92.22%, and 91.94%, respectively. In addition, case studies show that the model's predictions are supported by biological experiments: 25 of the 30 highest-scoring circRNA-disease associations were confirmed by recent literature. Based on these results, the proposed model can be expected to serve as an effective computational tool for predicting circRNA-disease associations and to provide promising candidates for biological experiments.
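The first two steps described in the abstract (fusing similarity matrices, then propagating features over the resulting network) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' SAGCN: the averaging fusion rule, the standard symmetric-normalized graph convolution, the random toy data, and all function names are hypothetical, and the MLP classifier stage is omitted.

```python
import numpy as np


def fuse_similarities(mats):
    """Average several same-shaped similarity matrices (assumed fusion rule)."""
    return np.mean(np.stack(mats), axis=0)


def normalize_adjacency(sim):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of a standard GCN."""
    a = sim + np.eye(sim.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_hat, x, w):
    """One graph-convolution step: ReLU(A_hat @ X @ W)."""
    return np.maximum(a_hat @ x @ w, 0.0)


def random_similarity(rng, n):
    """Toy symmetric similarity matrix with ones on the diagonal."""
    m = rng.random((n, n))
    m = (m + m.T) / 2
    np.fill_diagonal(m, 1.0)
    return m


rng = np.random.default_rng(0)
n_nodes, n_hidden = 5, 3

# Two toy similarity sources for the same set of nodes (e.g. circRNAs).
sim_a = random_similarity(rng, n_nodes)
sim_b = random_similarity(rng, n_nodes)

fused = fuse_similarities([sim_a, sim_b])
a_hat = normalize_adjacency(fused)

# Use the fused similarity rows as input features; weights are random here,
# whereas a real model would learn them during training.
weights = rng.normal(size=(n_nodes, n_hidden))
features = gcn_layer(a_hat, fused, weights)
```

In a full pipeline, the circRNA and disease features produced this way would be paired for each candidate association and passed to a classifier, which is then evaluated with 5-fold cross-validation.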