Extreme clustering – A clustering method via density extreme points

Shuliang Wang, Qi Li, Chuanfeng Zhao, Xingquan Zhu, Hanning Yuan, Tianru Dai
DOI: https://doi.org/10.1016/j.ins.2020.06.069
IF: 8.1
2021-01-01
Information Sciences
Abstract: Peak clustering, a density-based clustering method, has shown remarkable performance in cluster analysis. In practice, however, it suffers from two major drawbacks: (i) when cluster sample densities differ significantly, it has difficulty finding cluster centres in low-density clusters; (ii) in some cases, it incorrectly labels many normal points as noise. In this paper, we propose a new method, extreme clustering, to overcome these drawbacks. The core idea of extreme clustering is to identify density extreme points and use them to find cluster centres. In addition, a noise detection module is introduced to identify noisy data points in the clustering results. As a result, extreme clustering is robust to datasets with different density distributions. Experiments and validations on over 40 datasets show that extreme clustering not only inherits the cluster validity of peak clustering but also overcomes its shortcomings with a significant performance gain. Case studies on real-world haze analysis further demonstrate that extreme clustering can find some of the main haze origins in a Chinese city.
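The abstract does not define how density or a "density extreme point" is computed, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a cutoff-count density (the number of neighbours within a hypothetical `radius`) and treats a point as a density extreme point when no neighbour within that radius is denser. The function names, the `radius` parameter, the tie-breaking rule, and the toy data are all illustrative assumptions.

```python
import numpy as np


def local_density(points, radius):
    """Count, for each point, the other points closer than `radius` (assumed density)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    rho = (dist < radius).sum(axis=1) - 1  # subtract 1 to exclude the point itself
    return rho, dist


def density_extreme_points(points, radius):
    """Return indices of points that are local density maxima within `radius`."""
    rho, dist = local_density(points, radius)
    n = len(points)
    extremes = []
    for i in range(n):
        neighbours = np.flatnonzero((dist[i] < radius) & (np.arange(n) != i))
        # A point is dominated if a neighbour is denser; ties go to the lower index.
        dominated = any(
            rho[j] > rho[i] or (rho[j] == rho[i] and j < i) for j in neighbours
        )
        if not dominated:
            extremes.append(i)
    return extremes


def grid(centre, spacing, side):
    """Build a side x side grid of points centred on `centre` (toy data helper)."""
    offsets = (np.arange(side) - (side - 1) / 2) * spacing
    xs, ys = np.meshgrid(offsets, offsets)
    return np.column_stack([xs.ravel(), ys.ravel()]) + np.asarray(centre)


# Toy data: one dense cluster and one much sparser cluster, far apart.
dense = grid((0.0, 0.0), spacing=0.2, side=5)
sparse = grid((10.0, 10.0), spacing=0.6, side=3)
points = np.vstack([dense, sparse])

rho, _ = local_density(points, radius=0.9)
extremes = density_extreme_points(points, radius=0.9)
for i in extremes:
    print(i, points[i], rho[i])
```

In this toy data each grid contributes one extreme point even though the sparse cluster's density is far lower than the dense cluster's. A local-maximum criterion of this kind does not depend on a global density threshold, which is one plausible reading of why extreme points help with low-density clusters.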
computer science, information systems