A Novel Approach to Revealing Positive and Negative Co-Regulated Genes.
Yu-Hai Zhao, Guo-Ren Wang, Ying Yin, Guang-Yu Xu
DOI: https://doi.org/10.1007/s11390-007-9033-7
IF: 1.871
2007-01-01
Journal of Computer Science and Technology
Abstract: As biologists have observed, there is a real and emerging need to identify co-regulated gene clusters, which include both positively and negatively regulated gene clusters. However, the existing pattern-based and tendency-based clustering approaches are designed only for finding positively regulated gene clusters. In this paper, a new subspace clustering model called g-Cluster is proposed for gene expression data. The proposed model has the following advantages: 1) it finds both positive and negative co-regulated genes at once, 2) it is not restricted to magnitude transformation relationships among co-regulated genes, and 3) it guarantees the quality of clusters and the significance of regulations using a novel similarity measurement, gCode, and a user-specified regulation threshold δ, respectively. No previous work meets all of these requirements. Moreover, the MDL technique is introduced to avoid generating insignificant g-Clusters. A tree structure, the GS-tree, is also designed, along with two algorithms that combine efficient pruning and optimization strategies to identify all qualified g-Clusters. Extensive experiments are conducted on real and synthetic datasets. The experimental results show that 1) the algorithms are able to find a number of co-regulated gene clusters missed by previous models, which are potentially of high biological significance, and 2) the algorithms are effective and efficient, and outperform the existing approaches.
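The abstract does not define gCode or the g-Cluster model in detail, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a simple tendency encoding: each gene's expression profile is turned into a sequence of up/down symbols, and a change counts as a regulation only if its magnitude is at least a threshold `delta` (standing in for δ). Under that assumption, two genes are positively co-regulated when their codes match and negatively co-regulated when one code is the sign inversion of the other. The function names `tendency_code` and `relation` and the sample data are hypothetical.

```python
def tendency_code(profile, delta):
    """Encode consecutive changes as +1 (up), -1 (down), or 0 (below delta).

    Hypothetical encoding for illustration; not the paper's gCode.
    """
    code = []
    for prev, curr in zip(profile, profile[1:]):
        diff = curr - prev
        if diff >= delta:
            code.append(1)
        elif diff <= -delta:
            code.append(-1)
        else:
            code.append(0)  # change too small to count as a regulation
    return tuple(code)


def relation(profile_a, profile_b, delta):
    """Classify two profiles as 'positive', 'negative', or 'none'."""
    code_a = tendency_code(profile_a, delta)
    code_b = tendency_code(profile_b, delta)
    # Require every step to be a significant regulation in both genes.
    if 0 in code_a or 0 in code_b:
        return "none"
    if code_a == code_b:
        return "positive"
    if code_a == tuple(-s for s in code_b):
        return "negative"
    return "none"


# Made-up expression values over four conditions.
g1 = [1.0, 3.0, 2.0, 5.0]
g2 = [10.0, 30.0, 18.0, 40.0]  # same tendency as g1, no fixed shift or scale
g3 = [5.0, 2.0, 4.0, 1.0]      # opposite tendency to g1
g4 = [1.0, 1.2, 0.9, 1.1]      # changes smaller than delta

print(relation(g1, g2, delta=1.0))  # positive
print(relation(g1, g3, delta=1.0))  # negative
print(relation(g1, g4, delta=1.0))  # none
```

In this sketch, g1 and g2 are matched even though neither is a shifted or scaled copy of the other, which loosely mirrors the abstract's point about not depending on a magnitude transformation relationship.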