Abstract: With the rapid growth in the volume and dimensionality of data, processing high-dimensional data has become a key challenge for cluster analysis in the age of big data. Subspace clustering is an important approach to high-dimensional data clustering, because data in a class or category lie in a low-dimensional subspace of the ambient space. Sparse Subspace Clustering (SSC), proposed by Elhamifar, finds sparse representations of data distributed in a union of low-dimensional subspaces. SSC solves for the sparse self-expression coefficients of the data matrix under an L1-norm constraint via the Alternating Direction Method of Multipliers (ADMM) and builds the Laplacian matrix of the data. The data are then assigned to categories by spectral clustering. However, ADMM has many parameters to tune and converges slowly, which prevents SSC from handling large-scale datasets efficiently. To address these problems, we propose a sparse subspace clustering algorithm based on an L0 constraint. The proposed method solves the sparse self-expression reconstruction problem under an L0-norm constraint using Orthogonal Matching Pursuit (OMP). OMP finds the sparse representation of each data point as a linear combination of the other data points in a direct and efficient way. The sparse self-expression coefficients obtained by OMP are transformed into a similarity matrix, to which spectral clustering is then applied to obtain the clustering result. To further reduce the computational complexity of OMP, we also optimize it by exploiting the correlation between successive iterations, which improves the efficiency of the algorithm. Experiments on synthetic data and the Extended Yale B database demonstrate that the proposed L0-constrained sparse subspace clustering is significantly more efficient than SSC while achieving comparable accuracy.
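The pipeline described in the abstract (OMP self-expression, then a similarity matrix) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `omp_self_expression` and `affinity_from_coefficients`, the sparsity level `k`, the tolerance `tol`, and the symmetrization |C| + |C|^T are assumptions chosen for illustration, and the paper's speed-up across successive iterations is not reproduced here. Data points are assumed to be stored as unit-norm columns of `X`.

```python
import numpy as np

def omp_self_expression(X, k, tol=1e-6):
    """Greedy OMP self-expression (illustrative sketch).

    X   : (d, n) array whose columns are data points, assumed unit-norm.
    k   : maximum number of other points used to represent each point.
    tol : stop early once the residual norm falls below this value.
    Returns C of shape (n, n) with zero diagonal, so that X @ C approximates X.
    """
    d, n = X.shape
    C = np.zeros((n, n))
    for j in range(n):
        x = X[:, j]
        residual = x.copy()
        support = []
        coef = np.zeros(0)
        available = np.ones(n, dtype=bool)
        available[j] = False  # a point may not represent itself
        for _ in range(k):
            # Pick the available point most correlated with the residual.
            corr = np.abs(X.T @ residual)
            corr[~available] = -np.inf
            i = int(np.argmax(corr))
            support.append(i)
            available[i] = False
            # Least-squares refit on the current support.
            coef, *_ = np.linalg.lstsq(X[:, support], x, rcond=None)
            residual = x - X[:, support] @ coef
            if np.linalg.norm(residual) < tol:
                break
        C[support, j] = coef
    return C

def affinity_from_coefficients(C):
    """Symmetric, non-negative similarity matrix (one common choice, assumed here)."""
    A = np.abs(C)
    return A + A.T
```

The resulting similarity matrix would then be handed to a spectral clustering routine, for example scikit-learn's `SpectralClustering` with `affinity="precomputed"`; that final step is left out of the sketch to keep it dependency-free.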

Sparse Poisson coding for high dimensional document clustering

Document Clustering Using Locality Preserving Indexing

Efficient Probabilistic Latent Semantic Analysis with Sparsity Control

A parsimonious family of multivariate Poisson-lognormal distributions for clustering multivariate count data

Co-Clustering With Manifold And Double Sparse Representation

Data-Dependent Sparsity for Subspace Clustering

Tensor LRR and Sparse Coding-Based Subspace Clustering

Sparse Subspace Clustering Using Square-Root Penalty

Adaptive Lasso and group-Lasso for functional Poisson regression

Sparse encoding for more-interpretable feature-selecting representations in probabilistic matrix factorization

Subspace Clustering of Very Sparse High-Dimensional Data

Sparse Poisson Regression with Penalized Weighted Score Function

Research on Clustering Performance of Sparse Subspace Clustering

The Sparse Poisson Means Model

Model-based Clustering with Sparse Covariance Matrices

Efficient Sparsity Estimation Via Marginal-Lasso Coding

Large-Scale Sparse Principal Component Analysis with Application to Text Data

Regularized bi-directional co-clustering

A Submodule Clustering Method for Multi-way Data by Sparse and Low-Rank Representation

Sparsity Based Poisson Denoising with Dictionary Learning

Sparse subspace clustering based on L0 constraint