Nonnegative Latent Factor Analysis-Incorporated and Feature-Weighted Fuzzy Double $c$-Means Clustering for Incomplete Data

Zhengyu Zhu, Guisong Yang, Ming Li, Xin Luo, Yan Song
DOI: https://doi.org/10.1109/TFUZZ.2022.3144489
IF: 12.253
2022-10-01
IEEE Transactions on Fuzzy Systems
Abstract: Fuzzy $c$-means (FCM) clustering is a promising method for handling uncertainty in data clustering. However, traditional FCM and most of its variants cannot handle incomplete inputs. To this end, a novel fuzzy clustering framework is put forward to perform highly accurate clustering on incomplete data. It adopts two ideas: 1) using a nonnegative latent factor (NLF) model to prefill the missing data in the inputs by rigidly extracting the involved entities’ latent features, where the principle of a minibatch gradient descent algorithm is incorporated into a single latent factor-dependent, nonnegative and multiplicative update algorithm to accelerate convergence; and 2) integrating the distribution of inputs and the weights of local features into the objective function through sparse self-representation and weighting allocation to focus on crucial features. In this way, an NLF analysis-incorporated and feature-weighted fuzzy double $c$-means clustering (NF$^2$D) method is obtained, in which the data distribution and instance correlation are considered simultaneously. Experiments on 12 real-world datasets, including both data and images, with different missing rates show that the proposed NF$^2$D method is significantly superior to state-of-the-art fuzzy clustering methods.
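The two-stage idea in the abstract (prefill missing entries with a nonnegative factorization, then cluster the completed data with FCM) can be illustrated with a minimal sketch. This is not the NF$^2$D method itself: it assumes a plain masked multiplicative update in place of the paper's minibatch-accelerated, single latent factor-dependent update, and standard FCM in place of the feature-weighted double $c$-means objective with sparse self-representation. The function names `nlf_prefill` and `fuzzy_c_means`, along with `rank`, `iters`, and the other parameters, are hypothetical choices made for illustration.

```python
import numpy as np

def nlf_prefill(X, mask, rank=2, iters=200, eps=1e-9, seed=0):
    """Fill entries where mask is False via a masked nonnegative factorization.

    Assumption: X is nonnegative on observed entries; missing entries may be NaN.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.random((n, rank)) + 0.1   # nonnegative row (instance) factors
    H = rng.random((rank, d)) + 0.1   # nonnegative column (feature) factors
    Xo = np.where(mask, X, 0.0)       # observed values, zeros elsewhere
    for _ in range(iters):
        # Multiplicative updates keep W and H nonnegative; the error is
        # measured only on observed entries because of the mask.
        R = mask * (W @ H)
        W *= (Xo @ H.T) / (R @ H.T + eps)
        R = mask * (W @ H)
        H *= (W.T @ Xo) / (W.T @ R + eps)
    # Keep observed values as they are; use the model only for missing ones.
    return np.where(mask, X, W @ H)

def fuzzy_c_means(X, c=2, m=2.0, iters=100, eps=1e-9, seed=0):
    """Standard FCM: returns memberships U (n x c) and centers V (c x d)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        Um = U ** m
        V = (Um.T @ X) / (Um.sum(axis=0)[:, None] + eps)
        D = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + eps
        inv = D ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, V

# Usage on synthetic data with roughly 20% of the entries hidden.
demo_rng = np.random.default_rng(1)
X_full = np.vstack([demo_rng.random((10, 3)), demo_rng.random((10, 3)) + 5.0])
observed = demo_rng.random(X_full.shape) > 0.2
X_missing = np.where(observed, X_full, np.nan)
X_filled = nlf_prefill(X_missing, observed)
U, V = fuzzy_c_means(X_filled, c=2)
```

Separating the prefilling step from the clustering step keeps each part easy to swap, so a feature-weighted objective could replace `fuzzy_c_means` without touching the imputation code.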
Mathematics, Computer Science