Autoencoder-like semi-NMF multiple clustering

Shihong Yao, Chuli Hu, Tao Wang, Xinyou Cui
DOI: https://doi.org/10.1016/j.ins.2021.04.080
IF: 8.1
2021-09-01
Information Sciences
Abstract: Clustering is performed to partition samples into disjoint groups for facilitating the discovery of hidden patterns in the data. Many real-world applications involve various clustering methods, most of which only produce a single clustering. As a response to this issue, multiple clustering, which aims to generate diverse and high-quality clustering, has emerged recently. This study proposes a novel autoencoder-like semi-nonnegative matrix factorization (NMF) multiple clustering (ASNMFMC) model that generates multiple non-redundant, high-quality clustering. The nonnegative property of the semi-NMF is utilized by the algorithm to enforce non-redundancy. Extensive experimental results demonstrate that the ASNMFMC is superior to the existing multiple clustering methods and can explore diverse high-quality clustering.
computer science, information systems
What problem does this paper attempt to address?
This paper aims to address the following issues:

1. **Diversity and High-Quality Clustering**:
   - Many real-world applications involve various clustering methods, but most traditional methods produce only a single clustering result.
   - To tackle this limitation, the study takes a multiple clustering approach that generates diverse, high-quality clustering results.
2. **Redundancy Control**:
   - Redundancy between the clustering results must be controlled so that the results are not highly similar to one another.
   - The study proposes a novel autoencoder-like semi-nonnegative matrix factorization (semi-NMF) multiple clustering model (ASNMFMC), which quantifies the statistical dependency between feature projection matrices using the Hilbert-Schmidt Independence Criterion (HSIC) and controls the redundancy between different clusterings through co-association information.
3. **Data Type Handling**:
   - Existing multiple clustering methods can usually handle only data containing non-negative values.
   - ASNMFMC can handle data containing negative values, which expands its applicability.

Together, these mechanisms allow the ASNMFMC model to produce diverse clustering results while maintaining clustering quality.
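
The HSIC dependency measure mentioned above can be illustrated with a small sketch. The summary does not state which kernel or estimator the paper uses, so the code below assumes the standard biased empirical estimator, tr(KHLH) / (n - 1)^2, with linear kernels. The function name `hsic_linear` and the matrices `w1` and `w2` are hypothetical and are not taken from the paper.

```python
import numpy as np


def hsic_linear(a: np.ndarray, b: np.ndarray) -> float:
    """Biased empirical HSIC between paired rows of a and b (linear kernels).

    a has shape (n, p) and b has shape (n, q); row i of a is paired with
    row i of b. Larger values indicate stronger statistical dependence.
    """
    n = a.shape[0]
    if b.shape[0] != n or n < 2:
        raise ValueError("a and b need the same number of rows (at least 2)")
    # Centering matrix H = I - (1/n) * 1 1^T
    h = np.eye(n) - np.ones((n, n)) / n
    # Linear-kernel Gram matrices (an assumption; other kernels are possible)
    gram_a = a @ a.T
    gram_b = b @ b.T
    return float(np.trace(gram_a @ h @ gram_b @ h)) / (n - 1) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two hypothetical feature projection matrices (features x latent dims)
    w1 = rng.normal(size=(50, 3))
    w2 = rng.normal(size=(50, 3))
    print("HSIC(w1, w1):", hsic_linear(w1, w1))
    print("HSIC(w1, w2):", hsic_linear(w1, w2))
```

In a multiple clustering objective, a term of this kind would typically be minimized across pairs of projection matrices so that each clustering relies on statistically different feature subspaces. How ASNMFMC weights this term and combines it with the co-association information is not specified in this summary.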