ICDM-GEHC: identifying cancer driver module based on graph embedding and hierarchical clustering
Shiyu Deng, Jingli Wu, Gaoshi Li, Jiafei Liu, Yumeng Zhao
DOI: https://doi.org/10.1007/s40747-023-01328-5
IF: 6.7
2024-02-03
Complex & Intelligent Systems
Abstract: Because cancers are highly heterogeneous, it is essential to explore driver modules using gene mutation data together with known interactions between genes/proteins. Unfortunately, latent false-positive interactions are inevitable in the Protein-Protein Interaction (PPI) network. Hence, in the presented method, a new weight evaluation index, based on the gene-microRNA network and the somatic mutation profile, is first introduced to weight the PPI network. Subsequently, the vertices in the weighted PPI network are hierarchically clustered by measuring the Mahalanobis distance between their feature vectors, which are extracted with the graph embedding method Node2vec. Finally, a heuristic process with dropping and extracting is conducted on the gene clusters to produce a group of gene modules. Extensive experimental results demonstrate that, in most cases, the proposed method outperforms four cutting-edge identification methods in recognizing acknowledged cancer-related genes and in generating modules that have relatively high coverage and mutual exclusivity and are significantly enriched for specific types of cancers. The majority of the genes in the identified modules are involved in cancer-related signaling pathways or have been reported to be carcinogenic in the literature. Furthermore, the experiments verify that many cancer-related genes detected by the proposed method are omitted by the four comparison methods.
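The clustering step described in the abstract (hierarchical clustering of vertex feature vectors under the Mahalanobis distance) can be illustrated with a minimal sketch. This is not the authors' implementation. The feature vectors here are synthetic placeholders standing in for Node2vec output, and the function name `cluster_embeddings`, the average-linkage criterion, the cluster count, and the gene names are all assumptions made for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_embeddings(features, n_clusters):
    """Agglomeratively cluster row vectors using the Mahalanobis distance.

    `features` is an (n_genes, n_dims) array. Average linkage and a fixed
    cluster count are illustrative choices, not taken from the paper.
    """
    # Inverse covariance of the feature dimensions; the pseudo-inverse
    # guards against a singular covariance matrix.
    inv_cov = np.linalg.pinv(np.cov(features, rowvar=False))
    # Condensed vector of pairwise Mahalanobis distances.
    condensed = pdist(features, metric="mahalanobis", VI=inv_cov)
    # Build the dendrogram, then cut it into at most `n_clusters` groups.
    tree = linkage(condensed, method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic stand-in for Node2vec embeddings: three well-separated groups
# of ten "genes" each in a 2-dimensional feature space (toy data only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
features = np.vstack([c + rng.normal(scale=0.5, size=(10, 2)) for c in centers])
genes = [f"gene_{i}" for i in range(len(features))]  # hypothetical names

labels = cluster_embeddings(features, n_clusters=3)

# Group gene names by cluster label.
clusters = {}
for gene, label in zip(genes, labels):
    clusters.setdefault(int(label), []).append(gene)

for label, members in sorted(clusters.items()):
    print(label, members)
```

In a real pipeline, `features` would be replaced by embeddings learned on the weighted PPI network, and the resulting clusters would still need the dropping-and-extracting heuristic mentioned in the abstract, which this sketch does not attempt.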
computer science, artificial intelligence