Abstract: As an extension of non-negative matrix factorization (NMF), graph-regularized non-negative matrix factorization (GNMF) has been widely applied in data mining and machine learning, particularly for tasks such as clustering and feature selection. Traditional GNMF methods typically rely on predefined graph structures to guide the decomposition, using fixed data graphs and feature graphs to capture relationships between data points and between features. However, these fixed graphs may limit the model’s expressiveness. Additionally, many NMF variants struggle with complex data distributions and are vulnerable to noise and outliers. To overcome these challenges, we propose a novel method called sparse feature-weighted double Laplacian rank constraint non-negative matrix factorization (SFLRNMF), along with its extended version, SFLRNMTF. These methods adaptively construct more accurate data similarity and feature similarity graphs while imposing rank constraints on the Laplacian matrices of these graphs. The rank constraints ensure that the ranks of the resulting Laplacian matrices reflect the true number of clusters, thereby improving clustering performance. Moreover, we apply a feature weighting matrix to the original data matrix to reduce the influence of irrelevant features, and impose an L2,1/2-norm sparsity constraint on the basis matrix to encourage sparse representations. An orthogonal constraint is also enforced on the coefficient matrix to ensure interpretability of the dimensionality reduction results. In the extended model (SFLRNMTF), we impose a double orthogonal constraint on the basis matrix and the coefficient matrix to enhance the uniqueness and interpretability of the decomposition, thereby facilitating clearer clustering results for both rows and columns. However, enforcing double orthogonal constraints can reduce approximation accuracy, especially with low-rank matrices, as it restricts the model’s flexibility.
To address this limitation, we introduce an additional factor matrix R, which acts as an adaptive component that balances the trade-off between constraint enforcement and approximation accuracy. This adjustment gives the model greater representational flexibility, improving reconstruction accuracy while preserving the interpretability and clustering clarity provided by the double orthogonality constraints. Consequently, SFLRNMTF becomes more robust in capturing data patterns and achieves high-quality clustering results on complex datasets. We also propose an efficient alternating iterative update algorithm to optimize the proposed model and provide a theoretical analysis of its performance. Clustering results on four benchmark datasets demonstrate that our method outperforms competing approaches.
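
The link between the rank of a graph Laplacian and the number of clusters can be illustrated with a small numerical sketch. It relies on a standard fact from spectral graph theory: for a non-negative similarity matrix on n nodes, the Laplacian has rank n - c exactly when the graph has c connected components. The code below is only an illustration of that property, not the optimization procedure of SFLRNMF or SFLRNMTF. The helper names `laplacian` and `count_components`, the toy similarity matrix, and the choice of the unnormalized Laplacian are all assumptions made for this example.

```python
import numpy as np


def laplacian(S):
    """Unnormalized Laplacian L = D - W, with W the symmetrized similarity matrix."""
    W = (S + S.T) / 2.0
    return np.diag(W.sum(axis=1)) - W


def count_components(S, tol=1e-8):
    """Count near-zero Laplacian eigenvalues, i.e. connected components of the graph."""
    eigvals = np.linalg.eigvalsh(laplacian(S))
    return int(np.sum(eigvals < tol))


# Toy similarity graph (hypothetical data): two fully connected groups, {0, 1, 2} and {3, 4}.
S = np.zeros((5, 5))
S[:3, :3] = 1.0
S[3:, 3:] = 1.0
np.fill_diagonal(S, 0.0)

n = S.shape[0]
c = count_components(S)
rank = np.linalg.matrix_rank(laplacian(S))
print(c, rank, n - c)  # 2 3 3

# A single weak edge between the groups merges them into one component.
S_bridged = S.copy()
S_bridged[2, 3] = S_bridged[3, 2] = 0.5
c_bridged = count_components(S_bridged)
print(c_bridged)  # 1
```

In the toy graph the Laplacian rank equals n - c, so constraining the rank of a learned graph's Laplacian is one way to require that the graph separates into a prescribed number of groups.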


Sparse Feature-Weighted Double Laplacian Rank Constraint Non-Negative Matrix Factorization for Image Clustering
