Abstract: As an unsupervised pattern classification method, clustering partitions an input dataset into groups, or clusters, and plays an important role in identifying the natural structure of the target dataset. It is widely used in data mining, pattern recognition, image processing, and related fields. However, because of differing parameter settings and the random selection of initial centers, traditional clustering algorithms may produce different partitions for the same dataset. A clustering validity index (CVI) is an important tool for evaluating the quality of the results produced by clustering algorithms, yet many existing CVIs suffer from complex computation, low time efficiency, and a narrow range of applications. To make clustering more stable, the traditional K-means algorithm is first improved by selecting initial centers with a density-parameter-based method rather than at random. Then, to widen the range of applications of clustering and better evaluate clustering partitions, a new variance-based clustering validity index (VCVI) is designed from the point of view of the spatial distribution of datasets. Finally, a new partitional clustering algorithm that integrates the improved K-means algorithm with the newly introduced VCVI is designed to determine the optimal number of clusters (Kopt) for a wide range of datasets. Furthermore, the commonly used empirical rule Kmax ≤ √n is reasonably explained by the newly designed VCVI. The new algorithm integrated with VCVI is compared with traditional algorithms integrated with five commonly used CVIs. The experimental results show that the new clustering method is more accurate and stable while consuming relatively less running time. (C) 2018 Elsevier B.V. All rights reserved.

An Adaptive Initial Cluster Centers Selection Algorithm for High-Dimensional Partition Clustering

Adaptive Dimension Reduction for Clustering High Dimensional Data

Improved Initial Cluster Center Selection in K-Means Clustering

Improved Initial Classes Partition Method of C-means Algorithm

A New Partitioning Based Algorithm for Document Clustering

Finding Good Initial Cluster Center by Using Maximum Average Distance

A Novel Algorithm for Initializing Clustering Centers

An Improved Initial Clustering Center Selection Method for K-Means Algorithm

New Initialization Method for Cluster Center

A composite neighbor-based algorithm for initializing cluster centers

Adaptive Initialization Method for K-means Algorithm

An Effective Partitional Clustering Algorithm Based on New Clustering Validity Index

Clustering by Defining and Merging Candidates of Cluster Centers Via Independence and Affinity

A Unifying Family of Data-Adaptive Partitioning Algorithms

Subspace Clustering by Directly Solving Discriminative K-means

Cluster Center Initialization and Outlier Detection Based on Distance and Density for the K-Means Algorithm

The Optimal Initial Centers Clustering Algorithm Based on Local Outlier Factor

A Heuristic Initialization-Independent Spectral Clustering

Density Based Initial Center Optimization Algorithm

Nonuniform Sparse Data Clustering Cascade Algorithm Based on Dynamic Cumulative Entropy

A New Method of Selecting K-means Initial Cluster Centers Based on Hotspot Analysis