Partially-independent component analysis for tissue heterogeneity correction in microarray gene expression analysis

Yue Wang, Junying Zhang, Javed I Khan, R N Clarke, Z Gu
DOI: https://doi.org/10.1109/NNSP.2003.1318001
2003-01-01
Abstract: Gene microarray technologies provide powerful tools for the large-scale analysis of gene expression in cancer research. Clinical applications often aim to facilitate a molecular classification of cancers based on discriminatory genes associated with different clinical stages or outcomes. However, because of tissue heterogeneity, gene expression profiles often represent a composite of more than one distinct source, so the extracted signatures may reflect the proportion of stromal contamination in the sample rather than the underlying tumor biology. We therefore introduce a computational approach that allows for a blind decomposition of gene expression profiles from mixed cell populations. The algorithm is based on a linear latent variable model whose parameters are estimated using partially-independent component analysis, supported by a subset of differentially expressed genes. We demonstrate the principle of the approach on data sets derived from mixed cell lines of small round blue cell tumors. Because accurate source separation can be achieved blindly and numerically, we anticipate that computational correction of tissue heterogeneity will be useful in a wide variety of gene microarray studies.
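
The linear latent variable model named in the abstract can be written out in a generic form. The abstract gives no notation, so every symbol here is an assumption introduced for illustration: X holds the observed profiles (M mixed samples by N genes), A holds the unknown mixing proportions (M samples by K cell populations), S holds the unknown pure-source profiles (K populations by N genes), G denotes the chosen subset of differentially expressed genes, and W is the estimated unmixing matrix. The sketch is noise-free and assumes M = K so that A is invertible; the paper's actual formulation may differ.

```latex
% Assumed notation, not taken from the paper.
% Observed mixed profiles as a linear mixture of latent source profiles:
\[
  X = A S, \qquad
  X \in \mathbb{R}^{M \times N},\;
  A \in \mathbb{R}^{M \times K},\;
  S \in \mathbb{R}^{K \times N}.
\]
% Per gene j, the observed expression across the M samples:
\[
  x_j = A s_j, \qquad j = 1, \dots, N.
\]
% Blind decomposition: estimate an unmixing matrix W using only the
% genes in the subset G, on which the sources are treated as independent,
\[
  W \approx A^{-1} \quad \text{estimated from } \{ x_j : j \in G \},
\]
% then apply it to all genes to recover the source profiles:
\[
  \hat{S} = W X .
\]
```

As in any blind source separation of this kind, the recovered sources are determined only up to the order and scale of the rows of S.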