Concept Factorization With Local Centroids

Mulin Chen, Xuelong Li
DOI: https://doi.org/10.1109/TNNLS.2020.3027068
Abstract: Data clustering is a fundamental problem in machine learning. Among the numerous clustering techniques, matrix factorization-based methods have achieved impressive performance because they provide a compact and interpretable representation of the input data. However, most existing works assume that each class has a single global centroid, which does not hold for data with complicated structures. Moreover, they cannot guarantee that each sample is associated with its nearest centroid. In this work, we present a concept factorization with local centroids (CFLC) approach for data clustering. The proposed model has the following advantages: 1) samples from the same class are allowed to connect with multiple local centroids, so that the manifold structure is captured; 2) the pairwise relationship between samples and centroids is modeled to produce a reasonable label assignment; and 3) the clustering problem is formulated as a bipartite graph partitioning task, and an efficient algorithm is designed for optimization. Experiments on several data sets validate the effectiveness of the CFLC model and demonstrate its superior performance over state-of-the-art methods.
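For context, the baseline that the abstract builds on can be sketched in a few lines. The block below is a minimal illustration of standard concept factorization with multiplicative updates, where each centroid is a nonnegative combination of the samples (X ≈ X W Vᵀ). It is not the CFLC model: it uses one global centroid per cluster and has no bipartite graph partitioning step. The function name, parameters, and the assumption of nonnegative input data are illustrative choices, not taken from the paper.

```python
import numpy as np


def concept_factorization(X, k, n_iter=200, eps=1e-10, seed=0):
    """Baseline concept factorization (illustrative sketch, not CFLC).

    X : (d, n) nonnegative array whose columns are samples (assumed).
    k : number of clusters / centroids.
    Returns W, V, hard labels, and the reconstruction error per iteration.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    K = X.T @ X                      # (n, n) Gram matrix, nonnegative here
    W = rng.random((n, k))           # centroids are X @ W
    V = rng.random((n, k))           # soft assignment of samples to centroids
    history = []
    for _ in range(n_iter):
        # Multiplicative updates keep W and V nonnegative.
        W *= (K @ V) / (K @ W @ (V.T @ V) + eps)
        V *= (K @ W) / (V @ (W.T @ K @ W) + eps)
        history.append(np.linalg.norm(X - X @ W @ V.T) ** 2)
    labels = V.argmax(axis=1)        # each sample goes to its strongest centroid
    return W, V, labels, history
```

Calling `concept_factorization(X, k=3)` on a nonnegative `(d, n)` array returns one label per column of `X`. Because every cluster is tied to a single column of `X @ W`, this baseline has exactly the global-centroid limitation that the abstract describes.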