Prediction of compound signature using high density gene expression profiling

Hisham K Hamadeh, Pierre R Bushel, Supriya Jayadev, Olimpia DiSorbo, Lee Bennett, Leping Li, Raymond Tennant, Raymond Stoll, J Carl Barrett, Richard S Paules, Kerry Blanchard, Cynthia A Afshari
DOI: https://doi.org/10.1093/toxsci/67.2.232
Abstract: DNA microarrays, used to measure the expression of thousands of genes simultaneously, hold promise for future application in efficient screening of therapeutic drugs. This will be aided by the development and population of a database with gene expression profiles corresponding to biological responses to exposures to known compounds whose toxicological and pathological endpoints are well characterized. Such databases could then be interrogated, using profiles corresponding to biological responses to drugs after developmental or environmental exposures. A positive correlation with an archived profile could lead to some knowledge regarding the potential effects of the tested compound or exposure. We have previously shown that cDNA microarrays can be used to generate chemical-specific gene expression profiles that can be distinguished across and within compound classes, using clustering, simple correlation, or principal component analyses. In this report, we test the hypothesis that knowledge can be gained regarding the nature of blinded samples, using an initial training set composed of gene expression profiles derived from livers of rats exposed to clofibrate, Wyeth 14,643, gemfibrozil, or phenobarbital for 24 h or 2 weeks. Highly discriminant genes were derived from our database training set using approaches including linear discriminant analysis (LDA) and genetic algorithm/K-nearest neighbors (GA/KNN). Using these genes in the analysis of coded liver RNA samples derived from 24-h, 3-day, or 2-week exposures to phenytoin, diethylhexylphthalate, or hexobarbital led to successful prediction of whether these samples were derived from livers of rats exposed to enzyme inducers or to peroxisome proliferators. This validates our initial hypothesis and lends credibility to the concept that the further development of a gene expression database for chemical effects will greatly enhance the hazard identification process.
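The workflow described in the abstract (derive discriminant genes from a labeled training set, then classify blinded samples using only those genes) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the gene indices and function names are hypothetical, and a simple two-class Fisher-style score stands in for the LDA and GA/KNN gene selection named above. Only the final K-nearest-neighbors vote mirrors a technique the abstract mentions.

```python
import random
from collections import Counter
from statistics import mean, pvariance


def make_profiles(n_samples, n_genes, shifted_genes, shift, rng):
    """Synthetic expression profiles; genes in `shifted_genes` are offset by `shift`."""
    return [
        [rng.gauss(shift if g in shifted_genes else 0.0, 1.0) for g in range(n_genes)]
        for _ in range(n_samples)
    ]


def discriminant_genes(profiles, labels, top_n):
    """Rank genes by a two-class Fisher-style score (assumed stand-in for LDA or GA/KNN)."""
    first, second = sorted(set(labels))
    group_a = [p for p, y in zip(profiles, labels) if y == first]
    group_b = [p for p, y in zip(profiles, labels) if y == second]
    scores = []
    for g in range(len(profiles[0])):
        xa = [p[g] for p in group_a]
        xb = [p[g] for p in group_b]
        # Squared difference of class means over summed within-class variance.
        score = (mean(xa) - mean(xb)) ** 2 / (pvariance(xa) + pvariance(xb) + 1e-9)
        scores.append((score, g))
    return [g for _, g in sorted(scores, reverse=True)[:top_n]]


def knn_predict(train, labels, sample, genes, k=3):
    """Majority vote among the k training profiles closest on the selected genes."""
    def squared_distance(profile):
        return sum((profile[g] - sample[g]) ** 2 for g in genes)

    nearest = sorted(range(len(train)), key=lambda i: squared_distance(train[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


rng = random.Random(0)
N_GENES = 200
PP_GENES = set(range(0, 10))   # hypothetical genes induced by peroxisome proliferators
EI_GENES = set(range(10, 20))  # hypothetical genes induced by enzyme inducers

# Labeled training set: 12 synthetic profiles per compound class.
train = (make_profiles(12, N_GENES, PP_GENES, 4.0, rng)
         + make_profiles(12, N_GENES, EI_GENES, 4.0, rng))
labels = ["peroxisome proliferator"] * 12 + ["enzyme inducer"] * 12

genes = discriminant_genes(train, labels, top_n=10)

# "Blinded" samples: true classes are kept aside and used only to check predictions.
blinded = (make_profiles(3, N_GENES, PP_GENES, 4.0, rng)
           + make_profiles(3, N_GENES, EI_GENES, 4.0, rng))
truth = ["peroxisome proliferator"] * 3 + ["enzyme inducer"] * 3

predictions = [knn_predict(train, labels, sample, genes) for sample in blinded]
for predicted, actual in zip(predictions, truth):
    print(f"predicted: {predicted:<24} actual: {actual}")
```

Restricting the distance calculation to the selected genes is the point of the sketch: most of the 200 simulated genes carry only noise, so a vote based on all of them would be dominated by uninformative measurements.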
What problem does this paper attempt to address?