Gene selection using biological knowledge and fuzzy clustering

S. Ghosh, S. Mitra
DOI: https://doi.org/10.1109/FUZZ-IEEE.2012.6250797
2012-06-10
Abstract: Gene expression data are high-dimensional and redundant, so dimensionality reduction is a prime concern. We employ the Fuzzy Clustering Large Applications based on RANdomized Search (FCLARANS) algorithm for attribute clustering and dimensionality reduction, guided by gene ontology and differential gene expression. Domain knowledge helps automate the selection of biologically meaningful partitions: Gene Ontology (GO) analysis detects biologically enriched, statistically significant clusters, and fold-change is measured to select differentially expressed genes as the representatives of these clusters. Tools such as Eisen plots and cluster profiles help establish the coherence of these clusters. Important representative features (genes) are extracted from each enriched gene partition to form the reduced gene space. The reduced gene set forms a biologically meaningful gene space and at the same time lowers the computational burden. External validation of the reduced subspace, using various well-known classifiers, establishes the effectiveness of the proposed methodology.
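The fold-change step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the gene clusters are already given (FCLARANS and the GO enrichment test are not reproduced), that expression values are positive and not log-transformed, and that samples fall into two classes. All names (`log2_fold_change`, `select_representatives`, the toy genes `g1` to `g4`) are hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(case_values, control_values, eps=1e-9):
    """log2 ratio of mean expression between two conditions.

    eps is a small constant (an assumption of this sketch) that guards
    against division by zero.
    """
    return math.log2((mean(case_values) + eps) / (mean(control_values) + eps))


def select_representatives(expression, labels, clusters, top_k=1):
    """Pick the top_k genes with the largest |log2 fold-change| per cluster.

    expression: dict mapping gene name -> list of values, one per sample
    labels:     list of 0/1 per sample (0 = control, 1 = case)
    clusters:   dict mapping cluster id -> list of gene names
    """
    scores = {}
    for gene, values in expression.items():
        case = [v for v, y in zip(values, labels) if y == 1]
        control = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = abs(log2_fold_change(case, control))

    reduced = {}
    for cluster_id, genes in clusters.items():
        ranked = sorted(genes, key=lambda g: scores[g], reverse=True)
        reduced[cluster_id] = ranked[:top_k]
    return reduced


# Toy data: four genes, two control samples followed by two case samples.
expression = {
    "g1": [1.0, 1.0, 4.0, 4.0],
    "g2": [2.0, 2.0, 2.0, 2.0],
    "g3": [8.0, 8.0, 1.0, 1.0],
    "g4": [3.0, 3.0, 6.0, 6.0],
}
labels = [0, 0, 1, 1]
clusters = {"c1": ["g1", "g2"], "c2": ["g3", "g4"]}

representatives = select_representatives(expression, labels, clusters)
print(representatives)
```

On this toy input the sketch keeps one gene per cluster, so the reduced gene space has as many features as there are retained clusters; raising `top_k` trades a larger feature set for more coverage of each cluster.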
Computer Science, Biology