Identification of cancer genes using a statistical framework for multi-experiment analysis of non-discretized array CGH data (vol 36, pg 13, 2008)

Christiaan Klijn, Henne Holstege, Jeroen de Ridder, Xiaoling Liu, Marcel Reinders, Jos Jonkers, Lodewyk Wessels
DOI: https://doi.org/10.1093/nar/gkm1143
IF: 14.9
2008-01-01
Nucleic Acids Research
Abstract: Tumor formation is in part driven by DNA copy number alterations (CNAs), which can be measured using microarray-based Comparative Genomic Hybridization (aCGH). Multiexperiment analysis of aCGH data from tumors allows discovery of recurrent CNAs that are potentially causal to cancer development. Until now, multiexperiment aCGH data analysis has been dependent on discretization of measurement data to a gain, loss or no-change state. Valuable biological information is lost when a heterogeneous system such as a solid tumor is reduced to these states. We have developed a new approach that takes nondiscretized aCGH data as input to identify regions that are significantly aberrant across an entire tumor set. Our method is based on kernel regression and accounts for the strength of a probe's signal, its local genomic environment and the signal distribution across multiple tumors. In an analysis of 89 human breast tumors, our method showed enrichment for known cancer genes in the detected regions and identified aberrations that are strongly associated with breast cancer subtypes and clinical parameters. Furthermore, we identified 18 recurrent aberrant regions in a new dataset of 19 p53-deficient mouse mammary tumors. These regions, combined with gene expression microarray data, point to known cancer genes and novel candidate cancer genes.
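
The abstract names kernel regression over nondiscretized probe signals as the core idea but gives no formulas. As a rough illustration only, the sketch below applies generic Nadaraya-Watson smoothing with a Gaussian kernel to continuous log-ratios and averages the result across tumors. It is not the authors' algorithm: the function names, the Gaussian kernel, the bandwidth parameter and the simple mean across tumors are all assumptions made for this example, and the significance testing the abstract mentions is not covered.

```python
import numpy as np


def kernel_smooth(positions, log_ratios, grid, bandwidth):
    """Gaussian-kernel (Nadaraya-Watson) smoothing of probe log-ratios.

    positions  : (n_probes,) genomic coordinates of the probes
    log_ratios : (n_tumors, n_probes) continuous aCGH values, not discretized
    grid       : (n_grid,) coordinates at which to evaluate the smoothed signal
    bandwidth  : kernel width, in the same units as positions (hypothetical)
    Returns an (n_tumors, n_grid) array.
    """
    positions = np.asarray(positions, dtype=float)
    log_ratios = np.atleast_2d(np.asarray(log_ratios, dtype=float))
    grid = np.asarray(grid, dtype=float)

    # weights[i, j] is the influence of probe j on grid point i, so nearby
    # probes (the local genomic environment) dominate each estimate.
    scaled = (grid[:, None] - positions[None, :]) / bandwidth
    weights = np.exp(-0.5 * scaled ** 2)

    # Weighted average of probe values per tumor. Grid points very far from
    # every probe can underflow to zero total weight; this sketch assumes
    # the grid stays within the probed region.
    return (log_ratios @ weights.T) / weights.sum(axis=1)


def mean_profile(positions, log_ratios, grid, bandwidth):
    """Average the smoothed signal over tumors (an assumed aggregation)."""
    return kernel_smooth(positions, log_ratios, grid, bandwidth).mean(axis=0)
```

Because the inputs stay continuous, a weak but consistent shift across many tumors still contributes to the averaged profile, whereas thresholding each probe to gain, loss or no-change first could discard it.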