Strategic Multi-Omics Data Integration via Multi-Level Feature Contrasting and Matching

Jinli Zhang, Hongwei Ren, Zongli Jiang, Zheng Chen, Ziwei Yang, Yasuko Matsubara, Yasushi Sakurai
DOI: https://doi.org/10.1109/tnb.2024.3456797
IF: 3.9
2024-10-18
IEEE Transactions on NanoBioscience
Abstract: The analysis and comprehension of multi-omics data have emerged as a prominent topic in bioinformatics and data science. However, the sparsity and high dimensionality of omics data make it difficult to extract meaningful information. Moreover, the heterogeneity inherent in multiple omics sources makes effective integration of multi-omics data challenging. To tackle these challenges, we propose MFCC-SAtt, a multi-level feature contrast clustering model based on self-attention, to extract informative features from multi-omics data. MFCC-SAtt treats each omics type as a distinct modality and employs an autoencoder with self-attention for each modality to integrate and compress its features into a shared feature space. By combining a multi-level feature extraction framework with a semantic information extractor, we mitigate optimization conflicts arising from different learning objectives. Additionally, MFCC-SAtt guides deep clustering with multi-level features, which further enhances the quality of the output labels. Extensive experiments on multi-omics data validate the performance of MFCC-SAtt. For instance, in a pan-cancer clustering task, MFCC-SAtt achieved an accuracy of over 80.38%.
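The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of the general idea of contrasting features across omics modalities. It assumes an InfoNCE-style loss, a common choice that may differ from the loss MFCC-SAtt actually uses. The function names, the temperature value, and the toy embeddings (standing in for per-modality autoencoder outputs) are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cross_view_contrastive_loss(view_a, view_b, temperature=0.5):
    """InfoNCE-style loss (an assumption, not the paper's stated objective).

    Row i of view_a and row i of view_b are embeddings of the same sample
    from two omics modalities. The loss is low when each sample is more
    similar to its own counterpart than to the other samples.
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarity of sample i (view A) to every sample in view B.
        logits = [sum(x * y for x, y in zip(a_i, b_j)) / temperature for b_j in b]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the matching sample.
        total += log_denom - logits[i]
    return total / len(a)


# Hypothetical 2-D embeddings for three samples in two modalities.
mrna = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
methylation_aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
# Same embeddings with the sample order broken, so the pairs no longer match.
methylation_shuffled = [
    methylation_aligned[1],
    methylation_aligned[2],
    methylation_aligned[0],
]

aligned_loss = cross_view_contrastive_loss(mrna, methylation_aligned)
shuffled_loss = cross_view_contrastive_loss(mrna, methylation_shuffled)
print(f"aligned: {aligned_loss:.3f}, shuffled: {shuffled_loss:.3f}")
```

In this toy setup the aligned pairing yields the lower loss, which is the behavior a cross-modality contrastive term is meant to reward. A real model would compute such a term on learned features at each level rather than on fixed vectors.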
biochemical research methods, nanoscience & nanotechnology
What problem does this paper attempt to address?