Dual Hypergraph Regularized PCA for Biclustering of Tumor Gene Expression Data
Xuesong Wang, Jian Liu, Yuhu Cheng, Aiping Liu, Enhong Chen
DOI: https://doi.org/10.1109/tkde.2018.2874881
IF: 9.235
2018-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Clustering is a powerful approach to analyzing gene expression data, which is crucial to the search for effective cancer treatments. Many graph-regularized clustering methods have been proposed and shown to outperform traditional clustering methods. However, they focus only on the intrinsic structure of the samples and fail to take the feature manifold into account. For gene expression data, it is reasonable to hypothesize that both the samples and the genes lie on nonlinear low-dimensional manifolds, namely the sample manifold and the gene manifold, respectively. In this paper, we therefore propose a Dual Hypergraph Regularized PCA (DHPCA) method for biclustering of tumor data that incorporates the geometric structures of both samples and features. First, we construct two hypergraphs from the gene expression data, a sample hypergraph and a gene hypergraph, to estimate the intrinsic geometric structures of the samples and the genes. Then, we introduce hypergraph regularization on both the gene side and the sample side. Finally, our biclustering method is formulated as two hypergraph regularized PCA problems with closed-form solutions. We experimentally validate the proposed DHPCA algorithm on real applications, and the promising results indicate its potential for high-dimensional data analysis.
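The three steps in the abstract (build a sample hypergraph and a gene hypergraph, regularize on both sides, solve two regularized PCA problems in closed form) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, and the abstract does not give the objective or the hypergraph construction, so everything specific here is an assumption: hyperedges are formed from each point and its k nearest neighbors with unit weights, the Laplacian is the commonly used normalized hypergraph Laplacian, and each side solves a graph-Laplacian-PCA-style problem, min ||X - U Q^T||^2 + alpha * tr(Q^T L Q) subject to Q^T Q = I, whose minimizer Q is given by the eigenvectors of alpha * L - X^T X with the smallest eigenvalues. The function names, `alpha`, and `n_neighbors` are hypothetical.

```python
import numpy as np


def knn_hypergraph_laplacian(points, n_neighbors=3):
    """Normalized hypergraph Laplacian for the rows of `points`.

    Assumption: one hyperedge per point, containing the point itself
    and its `n_neighbors` nearest neighbors, with unit hyperedge weights.
    """
    n = points.shape[0]
    sq = np.sum(points ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * points @ points.T
    np.fill_diagonal(d2, -np.inf)  # guarantees each point is in its own hyperedge

    H = np.zeros((n, n))  # incidence matrix: rows = vertices, columns = hyperedges
    for e in range(n):
        members = np.argsort(d2[e])[: n_neighbors + 1]
        H[members, e] = 1.0

    w = np.ones(n)       # hyperedge weights (assumed uniform)
    dv = H @ w           # vertex degrees
    de = H.sum(axis=0)   # hyperedge degrees
    Hn = H * (1.0 / np.sqrt(dv))[:, None]       # Dv^{-1/2} H
    theta = (Hn * (w / de)[None, :]) @ Hn.T     # Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}
    return np.eye(n) - theta


def hypergraph_pca(X, L, k, alpha=1.0):
    """Closed-form solution of min ||X - U Q^T||^2 + alpha * tr(Q^T L Q), Q^T Q = I.

    X has one column per item being embedded; L is the Laplacian over those items.
    """
    G = alpha * L - X.T @ X
    G = (G + G.T) / 2.0            # symmetrize against round-off
    _, vecs = np.linalg.eigh(G)    # eigenvalues returned in ascending order
    Q = vecs[:, :k]                # eigenvectors of the k smallest eigenvalues
    U = X @ Q
    return U, Q


def dual_hypergraph_pca(X, k, alpha=1.0, n_neighbors=3):
    """Embed genes (rows of X) and samples (columns of X) separately."""
    L_samples = knn_hypergraph_laplacian(X.T, n_neighbors)
    L_genes = knn_hypergraph_laplacian(X, n_neighbors)
    _, Q_samples = hypergraph_pca(X, L_samples, k, alpha)
    _, Q_genes = hypergraph_pca(X.T, L_genes, k, alpha)
    return Q_genes, Q_samples
```

Under these assumptions, `Q_genes` and `Q_samples` are low-dimensional embeddings of the genes and the samples; a biclustering would then follow from clustering the rows of each embedding (for example with k-means), which this sketch leaves out.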