Nearest-Neighbour-Induced Isolation Similarity and Its Impact on Density-Based Clustering.

Xiaoyu Qin, Kai Ming Ting, Ye Zhu, Vincent C. S. Lee
DOI: https://doi.org/10.1609/aaai.v33i01.33014755
2019-01-01
Proceedings of the AAAI Conference on Artificial Intelligence
Abstract: A recent proposal of data dependent similarity called Isolation Kernel/Similarity has enabled SVM to produce better classification accuracy. We identify shortcomings of using a tree method to implement Isolation Similarity; and propose a nearest neighbour method instead. We formally prove the characteristic of Isolation Similarity with the use of the proposed method. The impact of Isolation Similarity on density-based clustering is studied here. We show for the first time that the clustering performance of the classic density-based clustering algorithm DBSCAN can be significantly uplifted to surpass that of the recent density-peak clustering algorithm DP. This is achieved by simply replacing the distance measure with the proposed nearest-neighbour-induced Isolation Similarity in DBSCAN, leaving the rest of the procedure unchanged. A new type of clusters called mass-connected clusters is formally defined. We show that DBSCAN, which detects density-connected clusters, becomes one which detects mass-connected clusters, when the distance measure is replaced with the proposed similarity. We also provide the condition under which mass-connected clusters can be detected, while density-connected clusters cannot.
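The abstract does not spell out how the nearest-neighbour partitioning is built, so the following is only a minimal sketch under stated assumptions: each partitioning is taken to be the Voronoi diagram of `psi` points sampled from the data, and the similarity of two points is estimated as the fraction of `t` partitionings in which they fall into the same cell. The function name `nn_isolation_similarity` and the parameters `psi`, `t` and `seed` are illustrative and not taken from the paper.

```python
import random


def nn_isolation_similarity(points, psi=4, t=100, seed=0):
    """Estimate pairwise similarity from t nearest-neighbour partitionings.

    Assumption (not stated in the abstract): each partitioning assigns every
    point to the nearest of psi centres sampled without replacement from the
    data, and similarity is the fraction of partitionings in which two points
    share a cell.
    """
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]

    def sq_dist(a, b):
        # Squared Euclidean distance; enough for nearest-centre comparisons.
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    for _ in range(t):
        centres = rng.sample(points, psi)
        # Index of the nearest sampled centre for every point.
        cell = [
            min(range(psi), key=lambda j: sq_dist(p, centres[j]))
            for p in points
        ]
        for i in range(n):
            for k in range(i, n):
                if cell[i] == cell[k]:
                    counts[i][k] += 1
                    if k != i:
                        counts[k][i] += 1

    return [[c / t for c in row] for row in counts]


# Two small, well-separated groups of 2-D points (made-up illustrative data).
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
similarity = nn_isolation_similarity(data, psi=4, t=100, seed=0)

# Turn the similarity into a dissimilarity for a distance-based clusterer.
dissimilarity = [[1.0 - s for s in row] for row in similarity]
```

A matrix such as `dissimilarity` could then be passed to any DBSCAN implementation that accepts precomputed pairwise dissimilarities (for example scikit-learn's `DBSCAN(metric="precomputed")`), which mirrors the abstract's idea of swapping the distance measure while leaving the rest of the procedure unchanged.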