Penalized-regression-based bicluster localization

Hanjia Gao, Zhengjian Bai, Weiguo Gao, Shuqin Zhang
DOI: https://doi.org/10.1016/j.patcog.2021.107984
IF: 8
2021-01-01
Pattern Recognition
Abstract: Biclustering (co-clustering, two-mode clustering), as one of the classical unsupervised learning methods, has been applied in many different fields in recent years. Different types of biclustering methods have been developed, such as probabilistic methods, two-way clustering methods, and variance minimization methods. However, to the best of our knowledge, few regression-based methods have been proposed. Such methods have been applied in traditional clustering, where they can improve both computational efficiency and clustering accuracy. In this paper, we present a penalized regression-based method for localizing biclusters (PRbiclust). By imposing Truncated LASSO Penalty (TLP) and group TLP terms to penalize the column vectors and the row vectors in the regression model, the structure of biclusters in the data matrix is recovered. The model is formulated as an optimization problem with nonconvex penalties, and a computationally efficient algorithm is proposed to solve it. Convergence of the algorithm is proved. To extract the biclusters from the recovered data matrix, we propose a graph-based localization method. An evaluation criterion is also proposed to measure the efficiency of bicluster localization when noise entries exist. We apply the proposed method to both simulated datasets with different setups and a real dataset. Experiments show that the method captures the bicluster structure well and performs better than existing methods. © 2021 Elsevier Ltd. All rights reserved.
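The abstract names the Truncated LASSO Penalty (TLP) and its group variant but does not give their form. As a minimal sketch, assuming the commonly used definition min(|x|, tau) for a scalar and min(||v||_2, tau) for a vector, the penalties can be written as below. The function names `tlp` and `group_tlp` and the threshold name `tau` are illustrative choices made here, not taken from the paper, and the paper's exact objective and scaling may differ.

```python
import math

def tlp(x, tau):
    """Scalar TLP under the assumed definition min(|x|, tau)."""
    return min(abs(x), tau)

def group_tlp(v, tau):
    """Group TLP under the assumed definition min(||v||_2, tau)."""
    return min(math.sqrt(sum(c * c for c in v)), tau)

# Values below tau are penalized like LASSO; values above tau are capped.
small = tlp(0.2, tau=1.0)               # below the threshold
large = tlp(5.0, tau=1.0)               # above the threshold, capped at tau
group = group_tlp([3.0, 4.0], tau=2.0)  # vector norm 5.0 exceeds tau
```

The cap at `tau` is what makes the penalty nonconvex: past the threshold, a larger value incurs no additional cost, unlike the ordinary LASSO penalty.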