Functional analysis of multiple genomic signatures demonstrates that classification algorithms choose phenotype-related genes

W Shi, M Bessarabova, D Dosymbekov, Z Dezso, T Nikolskaya, M Dudoladova, T Serebryiskaya, A Bugrim, A Guryanov, R J Brennan, R Shah, J Dopazo, M Chen, Y Deng, T Shi, G Jurman, C Furlanello, R S Thomas, J C Corton, W Tong, L Shi, Y Nikolsky
DOI: https://doi.org/10.1038/tpj.2010.35
2010-01-01
The Pharmacogenomics Journal
Abstract: Gene expression signatures of toxicity and clinical response benefit both safety assessment and clinical practice; however, difficulties in connecting signature genes with the predicted end points have limited their application. The Microarray Quality Control Consortium II (MAQCII) project generated 262 signatures for ten clinical and three toxicological end points from six gene expression data sets, an unprecedented collection of diverse signatures that has permitted a wide-ranging analysis of the nature of such predictive models. A comprehensive analysis of the genes of these signatures and their nonredundant unions using ontology enrichment, biological network building and interactome connectivity analyses demonstrated the link between gene signatures and the biological basis of their predictive power. Different signatures for a given end point were more similar at the level of biological properties and transcriptional control than at the gene level. Signatures tended to be enriched in function and pathway in an end point- and model-specific manner, and showed a topological bias for incoming interactions. Importantly, the level of biological similarity between different signatures for a given end point correlated positively with the accuracy of the signature predictions. These findings will aid the understanding and application of predictive genomic signatures and support their broader use in predictive medicine.