Cancer Subtype Identification Through Integrating Inter and Intra Dataset Relationships in Multi-Omics Data

Mark Peelen, Leila Bagheriye, Johan Kwisthout
DOI: https://doi.org/10.1109/access.2024.3362647
IF: 3.9
2024-03-02
IEEE Access
Abstract: The integration of multi-omics data has emerged as a promising approach for gaining comprehensive insights into complex diseases such as cancer. This paper proposes a novel approach to identifying cancer subtypes by integrating multi-omics data for clustering. The proposed method, Linear Inter and Intra Dataset Affinity Fusion (LIDAF), utilises affinity matrices based on linear relationships between and within different omics datasets. Canonical Correlation Analysis is employed to create distance matrices based on Euclidean distances between canonical variates. The distance matrices are converted to affinity matrices, which are then fused in a three-step process. LIDAF addresses the limitations of the existing method, resulting in improved clustering performance as measured by the Adjusted Rand Index and the Normalized Mutual Information score. Moreover, LIDAF improves on 50% of the -log10 rank p-values obtained from Cox survival analysis relative to the best-reported method. This improvement in -log10 rank p-values is a major finding, underscoring the significance of the proposed method in identifying distinct cancer subtypes.
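The pipeline described in the abstract (canonical variates, Euclidean distance matrices, affinity matrices, fusion) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify the CCA variant, the distance-to-affinity conversion, or the three-step fusion, so everything below is an assumption made for illustration: a ridge-regularised CCA, distances taken between samples in canonical-variate space, a Gaussian kernel with a median bandwidth, and a plain average standing in for the fusion. All function names and the toy data are hypothetical.

```python
import numpy as np

def canonical_variates(X, Y, n_components=2, reg=1e-3):
    """Ridge-regularised CCA via whitening + SVD (an assumed variant).
    Returns the canonical variates of X and of Y (samples x components)."""
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wx, Wy = inv_sqrt(Sxx), inv_sqrt(Syy)
    U, _, Vt = np.linalg.svd(Wx @ Sxy @ Wy)
    A = Wx @ U[:, :n_components]       # projection weights for X
    B = Wy @ Vt.T[:, :n_components]    # projection weights for Y
    return Xc @ A, Yc @ B

def euclidean_distance_matrix(Z):
    """Pairwise Euclidean distances between the rows (samples) of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def gaussian_affinity(D):
    """Assumed conversion: Gaussian kernel with the median distance as bandwidth."""
    sigma = np.median(D[D > 0])
    return np.exp(-(D ** 2) / (2.0 * sigma ** 2))

# Toy data: 30 samples measured in two hypothetical omics layers.
rng = np.random.default_rng(0)
omics_a = rng.normal(size=(30, 5))
omics_b = rng.normal(size=(30, 4))

Za, Zb = canonical_variates(omics_a, omics_b)
affinities = [gaussian_affinity(euclidean_distance_matrix(Z)) for Z in (Za, Zb)]

# Placeholder fusion: a plain average, NOT the paper's three-step process.
fused = np.mean(affinities, axis=0)
```

A fused matrix of this kind could be passed to any clustering routine that accepts a precomputed affinity matrix, such as spectral clustering.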
Computer Science, Information Systems; Telecommunications; Engineering, Electrical & Electronic