Abstract: Subspace clustering methods have been widely employed in fields that involve clustering high-dimensional data and have attracted increasing attention. Subspace clustering is a cluster analysis technique with built-in feature selection; it can achieve better performance by selecting a subset of salient features and clustering on the resulting low-dimensional representation of the high-dimensional data. In many practical applications, soft clustering is known to provide a more meaningful partition of complex data than hard clustering. In this paper, we extend k-means clustering and present a novel reliability-based regularized weighted soft k-means clustering algorithm (RRWSKM). The method calculates the contribution of each dimension to each cluster and finds the different subsets of salient dimensions relevant to different clusters. Furthermore, it can identify the exact data patterns when the model parameters are tuned, and it exhibits good performance. These properties are achieved by incorporating dimension weight entropy and partition entropy terms into the objective function as regularizers, which avoid overfitting and encourage more dimensions to contribute to identifying the clusters. In addition, the reliability of the dimension weights is retained by exploiting a data reliability measure, from which the initial dimension weights can be determined, greatly enhancing the performance and robustness of the proposed algorithm. Since the optimization problem of RRWSKM is non-convex, the optimal solution is obtained by solving the optimization problem with iterative update formulas. Experiments on real-world data sets are conducted to verify the proposed algorithm. The results show that the method yields low-dimensional representations of high-dimensional data, achieves better clustering performance than other subspace clustering methods, and handles high-dimensional data well.
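
The abstract names the ingredients of the objective (per-cluster dimension weights, soft memberships, and two entropy regularizers) but does not give the update formulas. The following is a minimal sketch of a generic entropy-regularized weighted soft k-means iteration, not the RRWSKM algorithm itself: the data reliability measure and the reliability-based weight initialization are omitted because the abstract does not define them. The function name, the parameters `gamma` (dimension weight entropy strength) and `eta` (partition entropy strength), the initial centers `Z0`, and the closed-form softmax updates are all assumptions made for illustration.

```python
import math


def _softmax_neg(values, scale):
    """Return exp(-v / scale) normalized to sum to 1 (shifted for stability)."""
    lo = min(values)
    exps = [math.exp(-(v - lo) / scale) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_weighted_soft_kmeans(X, Z0, gamma=1.0, eta=1.0, n_iter=20):
    """Generic entropy-regularized weighted soft k-means (illustrative only).

    X:     list of points, each a list of floats.
    Z0:    initial cluster centers (hypothetical argument, one list per cluster).
    gamma: strength of the dimension weight entropy term (assumed name).
    eta:   strength of the partition entropy term (assumed name).
    Returns (U, Z, W): soft memberships, centers, per-cluster dimension weights.
    """
    n, d, k = len(X), len(X[0]), len(Z0)
    Z = [list(z) for z in Z0]
    W = [[1.0 / d] * d for _ in range(k)]  # uniform weights, not reliability-based
    U = []
    for _ in range(n_iter):
        # Memberships: u_ic proportional to exp(-weighted distance / eta).
        U = []
        for x in X:
            dist = [sum(W[c][j] * (x[j] - Z[c][j]) ** 2 for j in range(d))
                    for c in range(k)]
            U.append(_softmax_neg(dist, eta))
        for c in range(k):
            mass = sum(U[i][c] for i in range(n))
            if mass == 0.0:
                continue  # empty cluster: keep its previous center and weights
            # Centers: membership-weighted mean of the data in each dimension.
            Z[c] = [sum(U[i][c] * X[i][j] for i in range(n)) / mass
                    for j in range(d)]
            # Weights: w_cj proportional to exp(-within-cluster dispersion / gamma),
            # so dimensions in which the cluster is compact receive more weight.
            disp = [sum(U[i][c] * (X[i][j] - Z[c][j]) ** 2 for i in range(n))
                    for j in range(d)]
            W[c] = _softmax_neg(disp, gamma)
    return U, Z, W
```

In this sketch, larger values of `gamma` spread the weight over more dimensions and larger values of `eta` make the memberships softer, which mirrors the role the abstract assigns to the two entropy terms. Each row of the returned `U` and `W` sums to 1.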

A feature group weighting method for subspace clustering of high-dimensional data

An Entropy Weighting K-Means Algorithm for Subspace Clustering of High-Dimensional Sparse Data

Reliability-based regularized weighted soft k-means algorithm for subspace clustering

Subspace clustering of text documents with feature weighting k-means algorithm

An Improved K-Means Clustering Algorithm Based on Feature Weighting

Subspace Clustering by Directly Solving Discriminative K-means

Fast Adaptive K-Means Subspace Clustering for High-Dimensional Data

On the performance of feature weighting K-means for text subspace clustering

Weighted Subspace Fuzzy Clustering with Adaptive Projection

An Entropy Weighting Mixture Model for Subspace Clustering of High-Dimensional Data

Multi-feature Weighting Neighborhood Density Clustering

Silhouette coefficient-based weighting k-means algorithm

A Novel Intelligent Clustering Approach For High Dimensional Data In A Big Data Environment

Fast and robust K-means clustering via feature learning on high-dimensional data

Optimization of K-Modes Algorithm with Feature Weights

Adaptive Multi-View Subspace Clustering for High-Dimensional Data

Fuzzy K-Means with Variable Weighting in High Dimensional Data Analysis

DSKmeans: A new kmeans-type approach to discriminative subspace clustering

A new robust fuzzy clustering framework considering different data weights in different clusters

A New Feature Weighted Fuzzy Clustering Algorithm

Enhanced Subspace Clustering Through Combining Minkowski Distance and Cosine Dissimilarity