Constrained Neighborhood Preserving Concept Factorization for Data Representation

Mei Lu,Li Zhang,Xiangjun Zhao,Fanzhang Li
DOI: https://doi.org/10.1016/j.knosys.2016.04.003
IF: 8.139
2016-01-01
Knowledge-Based Systems
Abstract:Matrix factorization based techniques, such as nonnegative matrix factorization (NMF) and concept factorization (CF), have attracted a great deal of attentions in recent years, mainly due to their ability of dimension reduction and sparse data representation. Both techniques are of unsupervised nature and thus do not make use of a priori knowledge to guide the clustering process. This could lead to inferior performance in some scenarios. As a remedy to this, a semi-supervised learning method called Pairwise Constrained Concept Factorization (PCCF) was introduced to incorporate some pairwise constraints into the CF framework. Despite its improved performance, PCCF uses only a priori knowledge and neglects the proximity information of the whole data distribution; this could lead to rather poor performance (although slightly improved comparing to CF) when only limited a priori information is available. To address this issue, we propose in this paper a novel method called Constrained Neighborhood Preserving Concept Factorization (CNPCF). CNPCF utilizes both a priori knowledge and local geometric structure of the dataset to guide its clustering. Experimental studies on three real-world clustering tasks demonstrate that our method yields a better data representation and achieves much improved clustering performance in terms of accuracy and mutual information comparing to the state-of-the-arts techniques.
What problem does this paper attempt to address?