HGECDA: A Heterogeneous Graph Embedding Model for CircRNA-Disease Association Prediction

Yao Fu, Runtao Yang, Lina Zhang, Xu Fu
DOI: https://doi.org/10.1109/JBHI.2023.3299042
Abstract: Circular RNAs (circRNAs) are specifically and abnormally expressed in disease tissues, and thus can be used as biomarkers to diagnose relevant diseases. Predicting circRNA-disease associations provides essential clues for revealing the molecular mechanisms of disease development and for discovering novel therapeutic targets. Existing algorithms ignore the heterogeneous biological association information related to microRNAs (miRNAs). This paper develops HGECDA, a novel circRNA-disease association prediction method based on a heterogeneous graph embedding model. First, a heterogeneous graph network containing circRNA-miRNA-disease association information is constructed. To sample the heterogeneous information, a meta-path-based random walk that can capture the relevance between different types of nodes is employed. Then, a path embedding model based on skip-gram and random negative sampling is built to obtain the initial feature vectors of circRNAs and diseases. Finally, the CosMulformer model, which uses linearized self-attention and the Hadamard product, is designed to obtain the circRNA-disease interaction vectors and carry out the prediction task. Experimental results demonstrate the critical role of miRNA in enriching the information of the feature space, the effectiveness of the CosMulformer model in picking out deep local interaction features, and the feasibility of the Hadamard product as the integration pattern in the CosMulformer model. On the same dataset, HGECDA performs better than seven existing state-of-the-art methods. Moreover, case studies on breast cancer and colorectal cancer demonstrate the practical value of HGECDA in predicting potential circRNA-disease associations.
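To make the sampling step in the abstract concrete, the following is a minimal sketch of a meta-path-based random walk over a toy circRNA-miRNA-disease graph. It is not the authors' implementation: the node names, the edges, the function names `add_reverse` and `metapath_walk`, and the cyclic meta-path circRNA → miRNA → disease → miRNA are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import random

# Toy typed adjacency; every node name and edge here is made up for illustration.
EDGES = {
    ("circRNA", "miRNA"): {"circ_A": ["mir_1", "mir_2"], "circ_B": ["mir_2"]},
    ("miRNA", "disease"): {"mir_1": ["dis_X"], "mir_2": ["dis_X", "dis_Y"]},
}


def add_reverse(edges):
    """Return a copy of the typed adjacency with every edge also stored in reverse."""
    full = {key: {n: list(v) for n, v in adj.items()} for key, adj in edges.items()}
    for (src_type, dst_type), adj in edges.items():
        rev = full.setdefault((dst_type, src_type), {})
        for src, dsts in adj.items():
            for dst in dsts:
                rev.setdefault(dst, []).append(src)
    return full


def metapath_walk(adj, start, metapath, walk_length, rng):
    """Random walk in which each step may only move to the node type
    that the (cyclically repeated) meta-path prescribes next."""
    walk = [start]
    for step in range(walk_length - 1):
        cur_type = metapath[step % len(metapath)]
        nxt_type = metapath[(step + 1) % len(metapath)]
        neighbors = adj.get((cur_type, nxt_type), {}).get(walk[-1], [])
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


ADJ = add_reverse(EDGES)
# Assumed meta-path; after the last type the pattern wraps back to circRNA.
METAPATH = ["circRNA", "miRNA", "disease", "miRNA"]

if __name__ == "__main__":
    print(metapath_walk(ADJ, "circ_A", METAPATH, 9, random.Random(0)))
```

Walks produced this way can be treated as "sentences" of node identifiers and passed to a skip-gram model with negative sampling to obtain node embeddings, which is the general pattern the abstract describes.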