Abstract: Applications in many domains, such as text mining and natural language processing, must deal with high-dimensional data. High-dimensional data may exhibit better clustering structure in a selected low-dimensional subspace. Subspace clustering projects the data onto such a low-dimensional subspace before clustering. Traditional subspace clustering methods use eigenvalue decomposition to find the projection of the input data and then run K-means or kernel K-means to obtain the clustering matrix. Such methods are not only inefficient but also rely on a two-step procedure that yields only an approximate solution. Discriminative K-means (DisKmeans) integrates dimensionality reduction and clustering into a joint framework and solves the optimization problem by kernel K-means; however, it must iteratively update the centroids in the kernel space and the class labels, and it has quadratic time complexity. Accordingly, in this paper we propose an algorithm, Fast DisKmeans (FDKM), that obtains the cluster indicator matrix directly. The proposed method has linear time complexity, a significant reduction compared with the quadratic time complexity of DisKmeans. We also show that solving the objective function of DisKmeans is equivalent to representing the cluster assignment matrix by a low-dimensional linear mapping of the data. Based on this observation, we propose a second algorithm, Iterative Fast DisKmeans (IFDKM), which also has linear time complexity. Experiments on several datasets show that FDKM and IFDKM outperform the compared methods.
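For readers unfamiliar with the traditional two-step pipeline the abstract criticizes, the following is a minimal sketch of it, not the FDKM or IFDKM algorithm. It assumes a PCA-style projection obtained from the eigenvalue decomposition of the covariance matrix, followed by plain Lloyd K-means in the projected space. The function name `two_step_subspace_kmeans` and all parameter names are hypothetical and chosen only for illustration.

```python
import numpy as np

def two_step_subspace_kmeans(X, n_clusters, n_dims, n_iter=50, seed=0):
    """Illustrative two-step baseline: project first, then cluster.

    Assumption: the projection is PCA-style (top eigenvectors of the
    covariance matrix); the source does not specify the exact projection.
    """
    rng = np.random.default_rng(seed)

    # Step 1: eigenvalue decomposition of the covariance matrix.
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / len(Xc)
    _, eigvecs = np.linalg.eigh(cov)   # eigenvalues come back in ascending order
    W = eigvecs[:, -n_dims:]           # keep the top n_dims eigenvectors
    Z = Xc @ W                         # data projected onto the subspace

    # Step 2: plain Lloyd K-means on the projected data.
    centroids = Z[rng.choice(len(Z), size=n_clusters, replace=False)]
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = np.array([
            Z[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, W
```

Because the projection in step 1 is fixed before any cluster labels exist, it cannot adapt to the clustering found in step 2, which is the limitation that joint formulations such as DisKmeans are meant to address.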

BNAK-Divide-and-Merge Clustering Algorithm

An Improved Algorithm Based on Divide-and-Merge Clustering Algorithm

A Projection-Based Split-and-merge Clustering Algorithm

A split–merge clustering algorithm based on the k-nearest neighbor graph

A Parallel Varied Density-Based Clustering Algorithm with Optimized Data Partition

A Novel Density Peaks Clustering Algorithm Based on K Nearest Neighbors with Adaptive Merging Strategy

An improved k-means algorithm based on density normalization

Subspace Clustering by Directly Solving Discriminative K-means

xk-split: A Split Clustering Algorithm Bases on k-medoids

A Split-Merge Framework for Comparing Clusterings

Multi-Prototypes Convex Merging Based K-Means Clustering Algorithm

Minimum Spanning Tree Based Split-and-merge: A Hierarchical Clustering Method

Cluster Merging and Splitting in Hierarchical Clustering Algorithms

Spectral Clustering of Large-scale Data by Directly Solving Normalized Cut

Block-Based K-Medoids Partitioning Method with Standardized Data to Improve Clustering Accuracy

Divisive Hierarchical Clustering with K-means and Agglomerative Hierarchical Clustering

A Multi-Center Clustering Algorithm Based on Mutual Nearest Neighbors for Arbitrarily Distributed Data

A Graph Clustering Algorithm Providing Scalability

A Differential Evolution Algorithm With Adaptive Niching and K-Means Operation for Data Clustering

Normalized Tree Partitioning for Image Segmentation

Efficient Clustering Based on A Unified View of $K$-Means and Ratio-cut