Local Directional Centrality Clustering Based on K-nearest Neighbor Outlier Detection and Shared Neighborhood Strategy

Qing Liu, Zihang Feng, Liping Yan, Yuanqing Xia
DOI: https://doi.org/10.1109/ddcls61622.2024.10606710
2024-01-01
Abstract: Clustering is an important preliminary step in data analysis: it uncovers latent structure in the data that later analysis and processing can build on. The recently proposed boundary-seeking Clustering algorithm using the local Direction Centrality (CDC) is very effective on data with heterogeneous density and weak connectivity. However, it has two shortcomings. First, the reachable distance alone cannot fully distinguish weakly connected cases, which easily leads to erroneous cross-cluster connections. Second, the K-nearest neighbor search tends to reach across clusters, causing boundary points to be misjudged and, in turn, connection errors. This paper proposes a local direction centrality clustering algorithm based on K-nearest neighbor outlier detection and a shared neighborhood strategy (SODCDC) for sparse and weakly connected data. The algorithm uses K-nearest neighbor outlier detection to mitigate cross-cluster neighbor search and reduce the probability of misjudging boundary points. It also uses a shared neighborhood strategy to further prevent cross-cluster connections in weakly connected data. Experiments on several datasets show that the proposed algorithm outperforms the original CDC algorithm under commonly used evaluation metrics.
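The two ideas named in the abstract can be illustrated with a small sketch. This is not the SODCDC algorithm itself, whose details the abstract does not give; it only shows, under stated assumptions, how pruning outlying K-nearest neighbors and counting shared neighbors might look. The function names, the median-based pruning rule, and the `ratio` parameter are all hypothetical choices made for illustration.

```python
import numpy as np

def knn(points, k):
    """Return indices and distances of each point's k nearest neighbors (self excluded)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)            # a point is not its own neighbor
    idx = np.argsort(dist, axis=1)[:, :k]
    return idx, np.take_along_axis(dist, idx, axis=1)

def prune_outlier_neighbors(idx, dist, ratio=2.0):
    """Assumed rule: drop a neighbor if its distance exceeds ratio * median k-NN distance."""
    threshold = ratio * np.median(dist, axis=1, keepdims=True)
    keep = dist <= threshold
    return [set(idx[i, keep[i]].tolist()) for i in range(len(idx))]

def shared_neighbor_count(neighbors, i, j):
    """Number of neighbors that points i and j have in common."""
    return len(neighbors[i] & neighbors[j])

# Two well-separated groups on a line: indices 0..3 and 4..7.
points = np.array([[x, 0.0] for x in (0, 1, 2, 3, 100, 101, 102, 103)])
idx, dist = knn(points, k=4)
neighbors = prune_outlier_neighbors(idx, dist)
```

In this toy data, the raw 4-nearest-neighbor search for the point at index 3 reaches across to index 4 in the other group, and the assumed pruning rule removes that neighbor. Points in the same group then share neighbors, while points in different groups share none, which is the kind of signal a shared neighborhood strategy could use to refuse a cross-cluster connection.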