Accounting for non-genetic factors by low-rank representation and sparse regression for eQTL mapping.

Can Yang, Lin Wang, Shuqin Zhang, Hongyu Zhao
DOI: https://doi.org/10.1093/bioinformatics/btt075
IF: 5.8
2013-01-01
Bioinformatics
Abstract: Expression quantitative trait loci (eQTL) studies investigate how gene expression levels are affected by DNA variants. A major challenge in inferring eQTL is that a number of factors, such as unobserved covariates, experimental artifacts and unknown environmental perturbations, may confound the observed expression levels. This may both mask real associations and lead to spurious association findings. In this article, we introduce a LOw-Rank representation to account for confounding factors and make use of Sparse regression for eQTL mapping (LORS). We integrate the low-rank representation and sparse regression into a unified framework, in which single-nucleotide polymorphisms and gene probes can be jointly analyzed. Given the two model parameters, our formulation is a convex optimization problem. We have developed an efficient algorithm to solve this problem and its convergence is guaranteed. We demonstrate its ability to account for non-genetic effects using simulation, and then apply it to two independent real datasets. Our results indicate that LORS is an effective tool to account for non-genetic effects. First, our detected associations show higher consistency between studies than those of recently proposed methods. Second, we have identified some new hotspots that cannot be identified without accounting for non-genetic effects. The software is available at: http://bioinformatics.med.yale.edu/software.aspx. Supplementary data are available at Bioinformatics online.
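The abstract describes the idea only in words, so the following is a minimal Python sketch of how a low-rank term and a sparse regression term could be fitted jointly. It is not the authors' released software. It assumes an objective of the form 0.5*||Y - XB - L||_F^2 + rho*||B||_1 + lam*||L||_*, where Y is a samples-by-genes expression matrix, X is a samples-by-SNPs genotype matrix, B holds the sparse SNP-to-gene effects and L is the low-rank component absorbing non-genetic variation. The names rho and lam stand in for the "two model parameters"; the exact objective, the omission of an intercept, the function names and the alternating update scheme are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(A, t):
    # Elementwise shrinkage: proximal operator of t * (L1 norm).
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def svt(A, t):
    # Singular value thresholding: proximal operator of t * (nuclear norm).
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * soft_threshold(s, t)) @ Vt

def objective(Y, X, B, L, rho, lam):
    # Assumed objective: squared error + L1 penalty on B + nuclear norm of L.
    R = Y - X @ B - L
    nuclear = np.linalg.svd(L, compute_uv=False).sum()
    return 0.5 * np.sum(R ** 2) + rho * np.abs(B).sum() + lam * nuclear

def lors_sketch(Y, X, rho, lam, n_iter=100):
    # Hypothetical solver: alternate an exact update of L with one
    # proximal gradient step on B. Both steps are non-increasing in
    # the assumed convex objective.
    n, q = Y.shape
    p = X.shape[1]
    B = np.zeros((p, q))
    L = np.zeros((n, q))
    # Step size 1 / (largest singular value of X)^2, the Lipschitz bound.
    step = 1.0 / max(np.linalg.norm(X, 2) ** 2, 1e-12)
    history = []
    for _ in range(n_iter):
        # Given B, the best L is a thresholded SVD of the residual.
        L = svt(Y - X @ B, lam)
        # Given L, take one gradient step on the squared error, then shrink.
        grad = -X.T @ (Y - X @ B - L)
        B = soft_threshold(B - step * grad, step * rho)
        history.append(objective(Y, X, B, L, rho, lam))
    return B, L, history
```

In this sketch, nonzero entries of B would be read as candidate SNP-gene associations, while L collects broad expression structure shared across many genes. Both rho and lam would need to be chosen by the user, for example by cross-validation.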