Integrating Heterogeneous Datasets by Using Multimodal Deep Learning

Fariba Khoshghalbvash, Jean X. Gao
DOI: https://doi.org/10.1007/978-981-13-6508-9_35
2019-06-14
Abstract: Rapid collection of data sources, varying in volume and structure, poses a challenge for scientists seeking a practical approach to handling heterogeneous data. Multimodal learning and integrated analysis make it possible to extract valuable information from a collection of simple raw data sources. Data integration can therefore lead to more reliable and robust results. High-throughput sequencing technologies, especially next-generation sequencing, provide multi-platform genomic data such as gene expression, SNP, CNV, DNA methylation, and miRNA expression. In this paper, we present a multimodal deep neural network that exploits the mutual information between three different modalities to classify breast cancer patients into two groups based on their survival rate. Experimental results indicate that our method improves classification accuracy and performs better on imbalanced data than state-of-the-art single-modal methods.
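As a rough illustration of the general idea, the sketch below shows one common way to build a multimodal classifier: each modality passes through its own encoder, the encodings are concatenated, and a shared output layer produces a two-class probability. It is not the architecture from the paper, which the abstract does not specify. The feature counts, the hidden size, the concatenation-based fusion, and the names `init_multimodal_net` and `forward` are all assumptions made for this example, and the weights are random rather than trained.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_layer(rng, n_in, n_out):
    # Small random weights and a zero bias for one dense layer.
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


def init_multimodal_net(rng, input_dims, hidden_dim=8):
    # Hypothetical layout: one single-layer encoder per modality, plus an
    # output layer that sees all encodings side by side.
    encoders = [init_layer(rng, d, hidden_dim) for d in input_dims]
    head = init_layer(rng, hidden_dim * len(input_dims), 1)
    return {"encoders": encoders, "head": head}


def forward(params, modalities):
    # Encode each modality separately, then fuse by concatenation.
    encoded = [
        relu(x @ W + b) for x, (W, b) in zip(modalities, params["encoders"])
    ]
    fused = np.concatenate(encoded, axis=1)
    W, b = params["head"]
    # One probability per patient for the positive class.
    return sigmoid(fused @ W + b).ravel()


rng = np.random.default_rng(0)
input_dims = (20, 15, 10)  # assumed feature counts for three modalities
params = init_multimodal_net(rng, input_dims)
batch = [rng.normal(size=(4, d)) for d in input_dims]  # 4 synthetic patients
probs = forward(params, batch)
print(probs.shape)
```

Because the output layer operates on the concatenated encodings, it can weight features from different modalities jointly, which is the intuition behind integrating the data rather than training one model per modality.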