Complete canonical correlation analysis for multi-omic molecular subtyping of colorectal cancer

L Qi, W Wang, X Xing, K Wang
2018-01-01
Abstract: Colorectal cancer (CRC) is one of the most commonly diagnosed cancers and one of the leading causes of cancer-related death. Like other major malignancies, colon cancer is a heterogeneous disease, which poses a great challenge to selecting patients for optimized therapy. Recently, the CRC Subtyping Consortium (CRCSC) identified four consensus molecular subtypes (CMSs), i.e., CMS1 (MSI immune), CMS2 (Canonical), CMS3 (Metabolic) and CMS4 (Mesenchymal), with distinct biological characteristics and clinical associations. However, the classification system developed by the CRCSC can only be applied to transcriptome data, which greatly limits its potential application to other types of omic data. Here, we address this challenge by developing a multi-omic classifier based on data fusion using Canonical Correlation Analysis (CCA), a well-established method popular in pattern recognition tasks such as multi-view gait recognition, facial expression recognition, and handwritten digit recognition. Using colon cancer as a case study, we demonstrate that integrating different types of omic data using Complete Canonical Correlation Analysis (C3A), followed by classification with a support vector machine, provides a novel multi-omic cancer classification framework. Compared to single-omic classification, multi-omic classification substantially improved performance.
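For readers unfamiliar with CCA-based data fusion, the following is a minimal sketch of the standard (textbook) CCA objective and one common way to fuse the projected features. It is not taken from the paper and does not describe the C3A variant specifically. The symbols are assumptions introduced for illustration: x and y are feature vectors from two omic data types measured on the same sample, Σ_xx and Σ_yy are their within-set covariance matrices, Σ_xy is their cross-covariance matrix, and W_x, W_y collect the leading canonical directions as columns.

```latex
% Standard CCA: find directions w_x, w_y that maximize the correlation
% between the projections of the two views (illustrative notation).
\[
(w_x^{*}, w_y^{*})
  = \arg\max_{w_x,\, w_y}
    \frac{w_x^{\top} \Sigma_{xy}\, w_y}
         {\sqrt{w_x^{\top} \Sigma_{xx}\, w_x}\;
          \sqrt{w_y^{\top} \Sigma_{yy}\, w_y}}
\]

% One common fusion strategy (serial fusion): concatenate the two
% projected views into a single feature vector z for the classifier.
\[
z = \begin{pmatrix} W_x^{\top} x \\ W_y^{\top} y \end{pmatrix}
\]
```

Under this sketch, the fused vector z would serve as the input to a downstream classifier such as the support vector machine mentioned in the abstract.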