Nonparametric Reduced-Rank Regression for Multi-SNP, Multi-Trait Association Mapping

Ashlee Valente, Geoffrey Ginsburg, Barbara E Engelhardt
DOI: https://doi.org/10.48550/arXiv.1512.02306
2015-12-08
Abstract: Genome-wide association studies have proven to be essential for understanding the genetic basis of disease. However, many complex traits (personality traits, facial features, disease subtyping) are inherently high-dimensional, impeding simple approaches to association mapping. We developed a nonparametric Bayesian reduced-rank regression model for multi-SNP, multi-trait association mapping that does not require the rank of the linear subspace to be specified. We show in simulations and real data that our model shares strength over SNPs and over correlated traits, improving statistical power to identify genetic associations with an interpretable, SNP-supervised low-dimensional linear projection of the high-dimensional phenotype. On the HapMap phase 3 gene expression QTL study data, we identify pleiotropic expression QTLs that classical univariate tests are underpowered to find and that two-step approaches cannot recover. Our Python software, BERRRI, is publicly available on GitHub: https://github.com/ashlee1031/BERRRI.
Applications, Genomics, Machine Learning
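
For readers unfamiliar with the reduced-rank regression that the abstract builds on, the following is a minimal sketch of the classical, fixed-rank least-squares version. It is not the authors' nonparametric Bayesian model and does not use the BERRRI software: here the rank must be supplied by hand, which is exactly the requirement the paper says its model removes. The function name `reduced_rank_regression`, the simulated genotype and trait matrices, and all dimensions are assumptions chosen for illustration.

```python
import numpy as np


def reduced_rank_regression(X, Y, rank):
    """Classical rank-constrained least-squares fit of Y ~ X @ B.

    Illustrative only: the rank is fixed in advance, unlike the
    nonparametric Bayesian model described in the abstract.
    """
    # Unconstrained least-squares coefficients (SNPs x traits).
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Leading right singular vectors of the fitted values span the
    # low-dimensional trait subspace explained by the SNPs.
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T
    # Project the coefficients onto that subspace.
    return B_ols @ V_r @ V_r.T


# Simulated data (hypothetical sizes): 200 samples, 10 SNPs, 30 traits.
rng = np.random.default_rng(0)
n, p, q, true_rank = 200, 10, 30, 2
X = rng.integers(0, 3, size=(n, p)).astype(float)  # genotype dosages 0/1/2
X -= X.mean(axis=0)                                # center each SNP
A = rng.normal(size=(p, true_rank))                # SNP -> latent factors
C = rng.normal(size=(true_rank, q))                # latent factors -> traits
Y = X @ A @ C + 0.5 * rng.normal(size=(n, q))      # traits with noise

B_hat = reduced_rank_regression(X, Y, rank=true_rank)
```

The estimate `B_hat` has one row per SNP and one column per trait, and its rank is at most the value passed in, so all traits share the same small set of SNP-driven directions. This is the sense in which a reduced-rank model shares strength over correlated traits.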