Distributional Clustering Using Nonnegative Matrix Factorization

Zhenfeng Zhu, Yangdong Ye
DOI: https://doi.org/10.1109/wcica.2012.6359370
2012-01-01
Intelligent Control and Automation
Abstract: In this paper, we propose DCMF, an iterative distributional clustering algorithm based on non-negative matrix factorization. When a data matrix A is factorized into C×M, an objective function is defined to impose conditional-distribution constraints on the base matrix C and the coefficient matrix M. In many applications, the conditional distributions of instances are already used to normalize the data dimensions. Taking this into account, we simplify the existing update rules and obtain the iterative algorithm DCMF, which satisfies the above constraints provided that the instance matrix is preprocessed as a conditional distribution. DCMF is simple and effective, and only the coefficient matrix needs to be initialized. As a result, the base matrix can be viewed as a centroid matrix, and the coefficient matrix records the fuzzy cluster memberships. Compared with several other factorization algorithms, experimental results on text, gene, and image data show that DCMF achieves an 8.06% improvement in clustering accuracy, a 35.08% reduction in computation time, and a 61.30% decrease in hard-clustering fuzziness.
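To make the setting in the abstract concrete, the following is a minimal sketch of a factorization A ≈ C×M in which the columns of A, C, and M are all kept as probability distributions. It is not the DCMF update rule, which the abstract does not state. It assumes the standard multiplicative updates for the KL-divergence objective, followed by column renormalization. The function names `normalize_columns` and `kl_nmf_distributional`, the random initialization of both factors, and the iteration count are illustrative assumptions.

```python
import numpy as np


def normalize_columns(X, eps=1e-12):
    """Scale each column of X so that it sums to (approximately) 1."""
    return X / (X.sum(axis=0, keepdims=True) + eps)


def kl_nmf_distributional(A, k, n_iter=200, seed=0, eps=1e-12):
    """Factorize A (features x instances) as C @ M with column-stochastic factors.

    Assumption: generic KL-divergence multiplicative updates plus
    renormalization, used here as a stand-in; this is NOT the paper's
    simplified DCMF rule.
    """
    rng = np.random.default_rng(seed)

    # Preprocess each instance (column) as a conditional distribution.
    A = normalize_columns(np.asarray(A, dtype=float))
    n_features, n_instances = A.shape

    # Hypothetical initialization: both factors random and column-normalized.
    C = normalize_columns(rng.random((n_features, k)) + eps)
    M = normalize_columns(rng.random((k, n_instances)) + eps)

    for _ in range(n_iter):
        # Update the coefficient (membership) matrix.
        R = A / (C @ M + eps)
        M = normalize_columns(M * (C.T @ R))

        # Update the base (centroid) matrix; the usual per-column
        # denominator is absorbed by the renormalization.
        R = A / (C @ M + eps)
        C = normalize_columns(C * (R @ M.T))

    return C, M
```

Under these assumptions, each column of `C` can be read as a cluster centroid over features and each column of `M` as a soft membership vector for one instance. Hard cluster labels can then be taken as `M.argmax(axis=0)`.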