LD-IDH-Clu: A New Clustering Algorithm Based on the Local Density Estimation and an Improved Density Hierarchy Strategy
Jianfang Qi, Yue Li, Haibin Jin, Dong Tian, Weisong Mu
DOI: https://doi.org/10.1007/978-981-19-6901-0_78
2022-01-01
Abstract: Clustering is an important branch of data mining. In this study, we propose a new clustering method, named LD-IDH-Clu, to overcome two problems of existing hierarchical clustering algorithms: sensitivity to noise and an unreasonable distribution of clustering results. First, to address the noise problem, a simple but effective local density estimation strategy is introduced to determine the representative points of clusters; the similarity between clusters is then measured by the distance between their representative points. Second, an improved density hierarchy scheme, IDH, is proposed to address the issue of unreasonable distribution. Compared with the existing density hierarchy (DH), IDH significantly reduces the running time while retaining the advantages of DH. Finally, the performance of LD-IDH-Clu is compared with that of several existing algorithms on eight UCI datasets, five 2-D synthetic datasets, and a Chinese wine market dataset. Experimental results show that LD-IDH-Clu outperforms the compared algorithms in terms of the Calinski-Harabasz (CH) Index, Adjusted Rand Index (ARI), Adjusted Mutual Information (AMI), V-measure, runtime, and the distribution of clustering results. In particular, applying LD-IDH-Clu to segment the Chinese wine market shows that, for an unfamiliar dataset, the algorithm can always yield some segments that are worthy of further analysis, regardless of how many sub-markets the dataset must be segmented into in a real-world scenario.
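The representative-point idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the density estimator, so this sketch assumes a k-nearest-neighbour estimate (the inverse of the mean distance to the k nearest points in the same cluster); the function names `local_density`, `representative_point`, and `cluster_distance` and the parameter `k` are hypothetical and are not taken from the paper.

```python
import math


def local_density(points, k):
    """Assumed estimator: inverse of the mean distance to the k nearest neighbours."""
    densities = []
    for i, p in enumerate(points):
        # Distances from p to every other point in the same cluster, ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        mean = sum(nearest) / len(nearest) if nearest else 0.0
        # Small constant avoids division by zero for duplicate points.
        densities.append(1.0 / (mean + 1e-12))
    return densities


def representative_point(cluster, k=2):
    """Return the point with the highest estimated local density."""
    if len(cluster) == 1:
        return cluster[0]
    dens = local_density(cluster, min(k, len(cluster) - 1))
    best = max(range(len(cluster)), key=dens.__getitem__)
    return cluster[best]


def cluster_distance(a, b, k=2):
    """Dissimilarity of two clusters: distance between their representatives."""
    return math.dist(representative_point(a, k), representative_point(b, k))
```

Because an isolated noise point has a low local density, it is unlikely to be chosen as a representative, so it does not pull two clusters together the way a minimum pairwise distance (single linkage) would.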