ELMO: an Efficient Logistic Regression-Based Multi-Omic Integrated Analysis Method for Breast Cancer Intrinsic Subtypes

Yexian Zhang, Ruoyao Shi, Chaorong Chen, Meiyu Duan, Shuai Liu, Yanjiao Ren, Lan Huang, Xiaofeng Dai, Fengfeng Zhou
DOI: https://doi.org/10.1109/access.2019.2960373
IF: 3.9
2019-01-01
IEEE Access
Abstract: Breast cancer is one of the most common cancers in women and a major cause of death among women worldwide. It is heterogeneous in both molecular characteristics and clinical outcomes across its molecular subtypes. High-throughput technologies have enabled the rapid accumulation of multiple types of Omic data for cancer patients, and these data sources pose a computational challenge for efficient integrated multi-Omic analysis. Existing studies usually investigated differential representation or machine learning problems using a single type of Omic data. This study hypothesized that different Omic types contribute complementary information, so that their integrated analysis may improve on single-Omic models. An efficient logistic regression-based multi-Omic integrated analysis method (ELMO) was proposed to integrate RNA-seq and DNA methylation data to detect the breast cancer intrinsic subtypes. ELMO achieved the highest accuracy with a smaller number of features than the existing filter and wrapper feature selection methods evaluated in this study. The experimental data supported the hypothesis that multi-Omic models outperform single-Omic ones.
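The complementarity hypothesis in the abstract can be illustrated with a minimal sketch. This is not the ELMO algorithm, whose details the abstract does not give. It assumes the simplest form of integration (concatenating each sample's RNA-seq and methylation features), a binary label instead of the multi-class intrinsic subtypes, and four hand-made toy samples. The names `fit_logistic`, `accuracy`, `rna`, `meth`, and `results` are hypothetical, and the reported values are training accuracies on the toy data, not results from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit binary logistic regression by plain batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample: p(y=1 | x) - y
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(X, y, w, b):
    """Fraction of samples whose predicted class (score > 0) matches y."""
    preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
             for xi in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)


# Toy data (assumption): one RNA-seq feature and one methylation feature
# per sample. Neither feature alone separates the classes, but their sum does.
rna = [[2.0], [-1.0], [1.0], [-2.0]]
meth = [[-1.0], [2.0], [-2.0], [1.0]]
labels = [1, 1, 0, 0]

# Feature-level integration: concatenate the two omic blocks per sample.
integrated = [r + m for r, m in zip(rna, meth)]

results = {}
for name, X in [("rna", rna), ("meth", meth), ("integrated", integrated)]:
    w, b = fit_logistic(X, labels)
    results[name] = accuracy(X, labels, w, b)

print(results)
```

On this toy set each single-omic model classifies only half of the samples correctly, while the concatenated model separates all four, which is the kind of complementary signal the hypothesis refers to. A realistic analysis would use held-out evaluation and feature selection rather than training accuracy on all features.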