Pathway Analysis for RNA-Seq Data Using a Score-Based Approach

Yi-Hui Zhou
DOI: https://doi.org/10.1111/biom.12372
IF: 1.701
2015-01-01
Biometrics
Abstract:A variety of pathway/gene-set approaches have been proposed to provide evidence of higher-level biological phenomena in the association of expression with experimental condition or clinical outcome. Among these approaches, it has been repeatedly shown that resampling methods are far preferable to approaches that implicitly assume independence of genes. However, few approaches have been optimized for the specific characteristics of RNA-Seq transcription data, in which mapped tags produce discrete counts with varying library sizes, and with potential outliers or skewness patterns that violate parametric assumptions. We describe transformations to RNA-Seq data to improve power for linear associations with outcome and flexibly handle normalization factors. Using these transformations or alternate transformations, we apply recently developed null approximations to quadratic form statistics for both self-contained and competitive pathway testing. The approach provides a convenient integrated platform for RNA-Seq pathway testing. We demonstrate that the approach provides appropriate type I error control without actual permutation and is powerful under many settings in comparison to competing approaches. Pathway analysis of data from a study of F344 vs. HIV1Tg rats, and of sex differences in lymphoblastoid cell lines from humans, strongly supports the biological interpretability of the findings.
What problem does this paper attempt to address?