Abstract: Cancer is a highly heterogeneous disease with significant variability in molecular features and clinical outcomes, making diagnosis and treatment challenging. In recent years, high-throughput omic technologies have facilitated the discovery of mechanisms underlying various cancer subtypes by providing diverse omics data, such as gene expression, DNA methylation, and miRNA expression. However, the complexity and heterogeneity of multi-omics data present significant challenges for their integration in exploring cancer subtypes. Various methods have been proposed to address these challenges. In this paper, we propose a novel and straightforward approach for identifying cancer subtypes by integrating patient-specific subnetwork features from different omics data. We construct patient-specific induced subnetworks using a random walk with restart algorithm from patient similarity networks (PSNs) and compute nine structural properties that capture essential network topology. These features are integrated across the three omic datasets to form comprehensive patient profiles. K-means clustering is then applied for cancer subtype identification. We evaluate our approach on five cancer datasets (breast invasive carcinoma, colon adenocarcinoma, glioblastoma multiforme, kidney renal clear cell carcinoma, and lung squamous cell carcinoma), each with three omic data types. The evaluation shows that our method produces promising and effective results, demonstrating competitive or superior performance compared to existing methods and underscoring its potential for advancing personalized cancer diagnosis and treatment.

What problem does this paper attempt to address?

This paper addresses the problem of cancer subtype identification. Cancer is a highly heterogeneous disease: even within the same type of cancer, molecular characteristics and clinical outcomes differ significantly, which makes effective diagnosis and treatment challenging. In recent years, high-throughput omics technologies have advanced the discovery of mechanisms behind cancer subtypes and provided various omics data such as gene expression, DNA methylation and miRNA expression. However, the complexity and heterogeneity of multi-omics data make it difficult to integrate these data when exploring cancer subtypes. To address this, the paper proposes a novel and straightforward method that identifies cancer subtypes by integrating patient-specific subnetwork features from different omics data. The specific steps are as follows:

1. **Constructing the patient similarity network (PSN)**: For each type of omics data, use the cosine similarity measure to construct a PSN, which links patients with similar molecular profiles in that omics type.
2. **Subnetwork construction**: From the PSN of each omics type, use the random walk with restart algorithm to generate an induced subnetwork for each patient, exploring both neighboring and remote nodes in the PSN.
3. **Subnetwork feature extraction**: Calculate nine structural properties from each subnetwork, which capture important aspects of network topology, including average node degree, average node strength, coefficient of variation of node strength, weighted density, trace, the largest and second-largest eigenvalues of the Laplacian matrix, average clustering coefficient, average weighted betweenness centrality and average weighted closeness centrality.
4. **Network feature fusion**: Average the feature vectors from the three omics data types to form a comprehensive patient feature vector.
5. **Sample clustering**: Apply the K-means clustering algorithm to the aggregated feature vectors, and use the silhouette score to determine the optimal number of clusters, thereby identifying cancer subtypes.

The paper evaluates the method on five cancer datasets and compares it with four existing methods, reporting competitive or superior performance in cancer subtype identification. These results suggest that the method can cope with the complexity and heterogeneity of multi-omics data and may serve as a tool for personalized cancer diagnosis and treatment.
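Steps 1 to 3 can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation, and several details are assumptions made only for illustration: the function names, the restart probability of 0.5, clipping negative similarities to zero, choosing the induced subnetwork as the `k` nodes with the highest visiting probability, and the definition of weighted density as total edge weight divided by the number of possible edges. Only three of the structural properties are computed here.

```python
import math


def cosine_similarity_network(profiles):
    """Build a PSN as a symmetric weight matrix from per-patient feature vectors."""
    n = len(profiles)
    norms = [math.sqrt(sum(x * x for x in p)) for p in profiles]
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(profiles[i], profiles[j]))
            sim = dot / (norms[i] * norms[j]) if norms[i] and norms[j] else 0.0
            # Assumption: negative similarities are clipped so weights stay non-negative.
            W[i][j] = W[j][i] = max(sim, 0.0)
    return W


def random_walk_with_restart(W, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * P^T p + restart * e_seed until the L1 change < tol."""
    n = len(W)
    strength = [sum(row) for row in W]
    p = [0.0] * n
    p[seed] = 1.0
    for _ in range(max_iter):
        new_p = [0.0] * n
        for i in range(n):
            if strength[i] == 0.0:
                continue  # isolated node: nothing to propagate
            for j in range(n):
                new_p[j] += (1.0 - restart) * p[i] * W[i][j] / strength[i]
        new_p[seed] += restart  # jump back to the seed patient
        change = sum(abs(a - b) for a, b in zip(new_p, p))
        p = new_p
        if change < tol:
            break
    return p


def induced_subnetwork_nodes(p, k):
    """Assumption: keep the k nodes with the highest visiting probability."""
    ranked = sorted(range(len(p)), key=lambda i: p[i], reverse=True)
    return sorted(ranked[:k])


def subnetwork_features(W, nodes):
    """Compute three of the structural properties on the induced subnetwork."""
    m = len(nodes)
    degrees = [sum(1 for j in nodes if j != i and W[i][j] > 0) for i in nodes]
    strengths = [sum(W[i][j] for j in nodes if j != i) for i in nodes]
    total_weight = sum(strengths) / 2.0  # each undirected edge is counted twice
    possible_edges = m * (m - 1) / 2.0
    return {
        "avg_degree": sum(degrees) / m,
        "avg_strength": sum(strengths) / m,
        # Assumption: weighted density = total edge weight / possible edges.
        "weighted_density": total_weight / possible_edges if possible_edges else 0.0,
    }


if __name__ == "__main__":
    # Toy data: four patients with two features each (hypothetical values).
    toy = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
    psn = cosine_similarity_network(toy)
    probs = random_walk_with_restart(psn, seed=0)
    print(subnetwork_features(psn, induced_subnetwork_nodes(probs, k=2)))
```

Running the same functions once per omics type yields one feature vector per patient and omics type. Steps 4 and 5 would then average those vectors per patient and pass the result to a K-means implementation, choosing the number of clusters by silhouette score.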

Personalized graph feature-based multi-omics data integration for cancer subtype identification
