Abstract: In comparisons between mutant and wild-type genotypes, transcriptome analysis can reveal the direct impacts of a mutation, together with the homeostatic responses of the biological system. Recent studies have highlighted that, when the effects of homozygosity for recessive mutations are studied in non-isogenic backgrounds, genes located close to the mutation on the same chromosome often appear over-represented among the genes identified as differentially expressed (DE). One hypothesis suggests that DE genes chromosomally linked to a mutation may not reflect functional responses to the mutation but, instead, result from an unequal distribution of expression quantitative trait loci (eQTLs) between sample groups of mutant or wild-type genotypes. This is problematic because expression differences caused by eQTLs are difficult to distinguish from those caused by functional responses to a mutation. Here we show that chromosomally co-located differentially expressed genes (CC-DEGs) are also observed in analyses of dominant mutations in heterozygotes. We define a method and a metric to quantify, in RNA-sequencing data, localised differential allelic representation (DAR) between the sample groups subjected to differential expression analysis. We show how the DAR metric can predict regions prone to eQTL-driven differential expression, and how it can improve functional enrichment analyses through gene exclusion or weighting-based approaches. Advantageously, this improved ability to identify probable eQTLs also reveals examples of CC-DEGs that are likely to be functionally related to a mutant phenotype. This supports a long-standing prediction that selection for advantageous linkage disequilibrium influences chromosome evolution. By comparing the genomes of zebrafish (Danio rerio) and medaka (Oryzias latipes), a teleost with a conserved ancestral karyotype, we find possible examples of chromosomal aggregation of CC-DEGs during evolution of the zebrafish lineage. Our method for DAR analysis requires only RNA-sequencing data, facilitating its application across new and existing datasets.

Many human-relevant diseases result from genetic mutations that disrupt cellular functions. We can model these mutations in other organisms (e.g. mouse, zebrafish) and employ gene expression analysis (transcriptomics) to determine how mutations directly affect cells and how cells adjust expression of their genes to compensate for these mutations. In our transcriptome analyses of dominant disease-causative mutations in zebrafish, we identified an interesting phenomenon where a disproportionate number of differentially expressed genes reside on the same chromosome as a mutated gene. Here, we provide strong evidence that the differential expression of some of these chromosomally co-located genes is not due to the mutation but is due to differential segregation of gene alleles with innately different expression levels (i.e. expression quantitative trait loci, eQTLs). We have developed a procedure to measure the likelihood that differential expression of a gene is due to an eQTL. This allows us to compensate for the presence of such eQTLs in bioinformatic analyses. Our procedure, Differential Allelic Representation (DAR) analysis, revealed evidence for aggregation of genes with related functions on the same chromosome over evolutionary timescales. DAR analysis allows disentanglement of eQTLs from mutation-dependent gene expression responses, thereby permitting more comprehensive investigation of transcriptome data.
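The text above describes DAR only in words and gives no formula. As a rough illustration of the idea, and not the authors' definition, the Python sketch below assumes that allele read counts at each single-nucleotide variant are pooled within each sample group, that DAR at a site is the Euclidean distance between the two groups' allele-proportion vectors scaled to lie between 0 and 1, and that site-level values are smoothed with a moving average to give a localised estimate. The function names, the window size, and the `1 - DAR` gene weight are all hypothetical choices made for this example.

```python
import math


def allele_proportions(counts):
    """Convert pooled read counts for the alleles (A, C, G, T) into proportions."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no reads observed at this site")
    return [c / total for c in counts]


def dar_at_site(counts_group1, counts_group2):
    """Assumed DAR form: Euclidean distance between the two groups'
    allele-proportion vectors, divided by sqrt(2) so the result is in [0, 1]."""
    p = allele_proportions(counts_group1)
    q = allele_proportions(counts_group2)
    return math.dist(p, q) / math.sqrt(2)


def smooth(values, window=3):
    """Centred moving average along the chromosome; the window shrinks at the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def gene_weight(dar):
    """Hypothetical weighting: genes in high-DAR regions contribute less
    to a downstream enrichment analysis."""
    return 1.0 - dar


# Toy data: (mutant counts, wild-type counts) at three neighbouring sites.
sites = [
    ((10, 0, 0, 0), (10, 0, 0, 0)),  # identical allele usage
    ((10, 0, 0, 0), (0, 10, 0, 0)),  # completely different alleles
    ((5, 5, 0, 0), (10, 0, 0, 0)),   # partially different
]
site_dar = [dar_at_site(mut, wt) for mut, wt in sites]
regional_dar = smooth(site_dar, window=3)
weights = [gene_weight(d) for d in regional_dar]
```

Under these assumptions, a region where the groups carry different alleles receives a DAR near 1 and would be flagged as prone to eQTL-driven differential expression. Genes in such a region could then be excluded with a threshold or down-weighted, which mirrors the exclusion and weighting approaches mentioned above.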

Differential allelic representation (DAR) identifies candidate eQTLs and improves transcriptome analysis
