Detection of Gene Copy Number Change in Array CGH Data

Jing Hu, Jianbo Gao, Yinhe Cao, Weijia Zhang
DOI: https://doi.org/10.1109/lssa.2006.250402
2006-01-01
Abstract: Developing effective methods for analyzing array-CGH data to detect chromosomal aberrations is very important for diagnosing the pathogenesis of cancer and other diseases. Current analysis methods, which are largely based on smoothing and/or segmentation, cannot detect both the aberration regions and the boundary break points with high accuracy. This is undesirable, since each point in the array represents a gene. Furthermore, when evaluating the accuracy of an algorithm for analyzing array-CGH data, it is commonly assumed that noise in the data follows a normal distribution. A fundamental question is whether noise in array-CGH data is indeed Gaussian and, if not, whether one can exploit the characteristics of the noise to develop novel analysis methods capable of accurately detecting the aberration regions and the boundary break points simultaneously. By analyzing bacterial artificial chromosome (BAC) arrays, oligonucleotide arrays, and high-density NimbleGen data, we show that when there are aberrations, noise in all three types of arrays is highly non-Gaussian and possesses long-range spatial correlations, and that such noise leads to worse performance of existing methods for detecting aberrations in array-CGH data than in the Gaussian noise case. We further develop a novel method that optimally exploits the characteristics of the noise and is capable of identifying both the aberration regions and the boundary break points very accurately.