Abstract: Identifying co-regulated genes and their transcription-factor binding sites (TFBSs) is a key step toward understanding transcription regulation. In addition to effective laboratory assays, various bi-clustering algorithms have been developed to detect co-expressed genes. Applied to gene expression data, bi-clustering methods discover subgroups of genes that show similar expression patterns under subsets of experimental conditions, where those subsets must themselves be identified. This paper proposes a novel fuzzy bi-clustering algorithm for the identification of co-regulated genes, built on two fuzzy partition matrices that are derived from the gene expression data with Axiomatic Fuzzy Set (AFS) theory. Specifically, the gene expression data are first transformed into two fuzzy partition matrices via the sub-preference relations theory of AFS. One matrix treats the genes as the universe and the conditions as the concepts; the other treats the conditions as the universe and the genes as the concepts. The identification of the co-regulated genes (bi-clusters) is carried out on both partition matrices simultaneously. A novel fuzzy similarity criterion is then defined on the partition matrices, and a cyclic optimization algorithm is designed to discover the bi-clusters that are significant at the expression level. These procedures guarantee that the generated bi-clusters have more significant expression values than those extracted by traditional bi-clustering methods. Finally, the performance of the proposed method is compared with that of three well-known bi-clustering algorithms on publicly available real microarray datasets. The experimental results agree with the theoretical analysis and show that the proposed algorithm can effectively detect co-regulated genes without any prior knowledge of the gene expression data.
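The two-matrix construction and the cyclic search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' AFS formulation: the abstract does not give the membership functions or the similarity criterion, so the sketch assumes a simple rank-based membership (the fraction of entries that are less than or equal to a given entry), takes the elementwise minimum of the two matrices as a stand-in similarity score, and alternates between selecting conditions and selecting genes until the selection stops changing. The names `rank_membership`, `fuzzy_bicluster`, `seed_gene` and `threshold` are hypothetical.

```python
import numpy as np


def rank_membership(x, axis):
    """Fraction of entries along `axis` that are <= each entry (values in (0, 1]).

    axis=0 ranks each gene among all genes under one condition (genes as universe);
    axis=1 ranks each condition among all conditions for one gene (conditions as universe).
    Assumed stand-in for an AFS membership function, not the paper's definition.
    """
    x = np.asarray(x, dtype=float)
    if axis == 0:
        # le[i, k, j] is True when x[k, j] <= x[i, j]
        le = x[None, :, :] <= x[:, None, :]
        return le.mean(axis=1)
    # le[i, j, k] is True when x[i, k] <= x[i, j]
    le = x[:, None, :] <= x[:, :, None]
    return le.mean(axis=2)


def fuzzy_bicluster(x, seed_gene, threshold=0.7, max_iter=50):
    """Grow one bi-cluster from a seed gene by alternating condition and gene selection."""
    x = np.asarray(x, dtype=float)
    gene_view = rank_membership(x, axis=0)   # genes as universe
    cond_view = rank_membership(x, axis=1)   # conditions as universe
    score = np.minimum(gene_view, cond_view)  # assumed similarity score

    genes = np.zeros(x.shape[0], dtype=bool)
    genes[seed_gene] = True
    conds = np.ones(x.shape[1], dtype=bool)

    for _ in range(max_iter):
        # Keep conditions under which the current genes score highly on average.
        new_conds = score[genes].mean(axis=0) >= threshold
        if not new_conds.any():
            break
        # Keep genes that score highly on average over the kept conditions.
        new_genes = score[:, new_conds].mean(axis=1) >= threshold
        if not new_genes.any():
            break
        if np.array_equal(new_genes, genes) and np.array_equal(new_conds, conds):
            break  # selection is stable
        genes, conds = new_genes, new_conds

    return np.flatnonzero(genes), np.flatnonzero(conds)


if __name__ == "__main__":
    # Toy data: genes 0-2 are strongly expressed under conditions 0-1 only.
    data = np.ones((6, 5))
    data[:3, :2] = 10.0
    print(fuzzy_bicluster(data, seed_gene=0))
```

On the toy matrix in the usage block, where three genes are strongly expressed under two conditions, the search starting from gene 0 settles on that block. Real expression data would need normalization, and the seed and threshold would have to be chosen with care; the sketch leaves both out.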

Study on Dynamic Clustering Analysis Method for Gene Expression Data Based on Multidimension Pseudo F-statistics

Estimating the Number of Clusters Via System Evolution for Cluster Analysis of Gene Expression Data

Effective Clustering Algorithms for Gene Expression Data

Min max kurtosis distance based improved initial centroid selection approach of K-means clustering for big data mining on gene expression data

An Analysis of Gene Expression Data using Penalized Fuzzy C-Means Approach

Genetic Algorithms Applied to Multi-Class Clustering for Gene Expression Data

Data Clustering Algorithm for DNA Microarray Based on Graph Theory

An Improved Biclustering Method for Analyzing Gene Expression Profiles

Clustering gene expression data based on predicted differential effects of GV interaction

Performance Analysis of Clustering Algorithms for Gene Expression Data

Application of New Clustering Algorithms in Gene Expression Data

Feature Weight-Based FCM Clustering Algorithm for Gene Expression Data

T-cell receptor delta gene recombination in common acute lymphoblastic leukemia: preferential usage of V delta 2 and frequent involvement of the J alpha cluster

A Novel Fuzzy Bi-Clustering Algorithm with AFS for Identification of Co-Regulated Genes

Gen-Cluster: an Efficient Gene Expression Data High Dimensional Clustering Algorithm

A Novel Fuzzy Bi-Clustering Algorithm with Axiomatic Fuzzy Set for Identification of Co-Regulated Genes

Performance Analysis of Enhanced Clustering Algorithm for Gene Expression Data

An Efficient Artificial Bee Colony and Fuzzy C Means Based Co-regulated Biclustering from Gene Expression Data

A Graph-based Approach to Estimating the Number of Clusters

A Gene Selection Method for GeneChip Array Data with Small Sample Sizes

Tendency based Subspace Clustering on Gene Expression Data