A Pipeline for High-Throughput Concentration Response Modeling of Gene Expression for Toxicogenomics

John S. House, Fabian A. Grimm, Dereje D. Jima, Yi-Hui Zhou, Ivan Rusyn, Fred A. Wright
DOI: https://doi.org/10.3389/fgene.2017.00168
IF: 3.7
2017-01-01
Frontiers in Genetics
Abstract: Cell-based assays are an attractive option to measure gene expression response to exposure, but the cost of whole-transcriptome RNA sequencing has been a barrier to the use of gene expression profiling for in vitro toxicity screening. In addition, standard RNA sequencing adds variability due to variable transcript length and amplification. Targeted probe-sequencing technologies such as TempO-Seq, with transcriptomic representation that can vary from hundreds of genes to the entire transcriptome, may reduce some components of variation. Analyses of high-throughput toxicogenomics data require renewed attention to read-calling algorithms and simplified dose-response modeling for datasets with relatively few samples. Using data from induced pluripotent stem cell-derived cardiomyocytes treated with chemicals at varying concentrations, we describe here and make available a pipeline for handling expression data generated by TempO-Seq to align reads, clean and normalize raw count data, identify differentially expressed genes, and calculate transcriptomic concentration-response points of departure. The methods are extensible to other forms of concentration-response gene-expression data, and we discuss the utility of the methods for assessing variation in susceptibility and the diseased cellular state.
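To make two of the steps named in the abstract concrete (normalizing raw counts and deriving a concentration-response point of departure), the following is a minimal illustrative sketch in Python. It is not the authors' pipeline: the function names, the counts-per-million scaling, the pseudocount, the fixed response threshold, and the log-linear interpolation are all assumptions chosen for simplicity, and the paper's own normalization and dose-response models may differ.

```python
import math


def counts_per_million(sample_counts):
    """Scale one sample's raw probe counts to counts per million (CPM).

    `sample_counts` maps a gene/probe name to its raw read count.
    CPM is an assumed, simple normalization, not necessarily the paper's.
    """
    total = sum(sample_counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: 1e6 * count / total for gene, count in sample_counts.items()}


def log2_fold_change(treated_cpm, control_cpm, pseudocount=1.0):
    """Log2 ratio of treated to control expression.

    The pseudocount (an assumption) avoids division by zero for silent genes.
    """
    return math.log2((treated_cpm + pseudocount) / (control_cpm + pseudocount))


def point_of_departure(concentrations, responses, threshold):
    """Estimate the lowest concentration at which |response| reaches `threshold`.

    Points are sorted by concentration; the crossing is interpolated linearly
    on |response| against log10(concentration) between the two bracketing
    points. Concentrations must be positive. Returns None if the threshold
    is never reached.
    """
    previous = None
    for conc, resp in sorted(zip(concentrations, responses)):
        if abs(resp) >= threshold:
            if previous is None:
                # Already past the threshold at the lowest tested concentration.
                return conc
            prev_conc, prev_resp = previous
            frac = (threshold - abs(prev_resp)) / (abs(resp) - abs(prev_resp))
            log_pod = math.log10(prev_conc) + frac * (
                math.log10(conc) - math.log10(prev_conc)
            )
            return 10 ** log_pod
        previous = (conc, resp)
    return None
```

In this sketch, a gene's per-concentration log2 fold changes (treated versus vehicle control) would be passed as `responses`, and the returned value is in the same units as `concentrations`. A model-based approach, fitting a curve to all concentrations rather than interpolating between two, would generally be more robust for noisy data with few replicates.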