Bayesian Variable Selection with Structure Learning: Applications in Integrative Genomics

Suprateek Kundu, Minsuk Shin, Yichen Cheng, Ganiraju Manyam, Bani K. Mallick, Veera Baladandayuthapani
DOI: https://doi.org/10.48550/arXiv.1508.02803
2015-08-12
Methodology
Abstract: Significant advances in biotechnology have allowed for simultaneous measurement of molecular data points across multiple genomic and transcriptomic levels from a single tumor/cancer sample. This has motivated systematic approaches to integrate multi-dimensional structured datasets, since cancer development and progression are driven by numerous co-ordinated molecular alterations and the interactions between them. We propose a novel two-step Bayesian approach that combines a variable selection framework with integrative structure learning between multiple sources of data. The structure learning in the first step is accomplished through novel joint graphical models for heterogeneous (mixed scale) data, allowing for flexible incorporation of prior knowledge. This structure learning subsequently informs the variable selection in the second step to identify groups of molecular features within and across platforms associated with outcomes of cancer progression. The variable selection strategy adjusts for collinearity and multiplicity, and also has theoretical justifications. We evaluate our methods through simulations and apply them to motivating genomic (DNA copy number and methylation) and transcriptomic (mRNA expression) data to assess important markers associated with glioblastoma progression.