A New Weight Based Density Peaks Clustering Algorithm for Numerical and Categorical Data

Wuning Tong, Yuping Wang, Junkun Zhong, Wei Yan
DOI: https://doi.org/10.1109/CIS.2017.00044
2017-01-01
Abstract: Discovering the potential group structure of objects is of crucial importance in data mining. Most existing clustering approaches apply only to purely numerical or purely categorical data. Only a few recent approaches can handle both numerical and categorical attributes, and these often incur a high computational cost. To cluster data with both numerical and categorical attributes efficiently, this paper proposes a new approach with the following schemes. First, a measure of the importance of each categorical attribute is designed, and a method for generating the weight of each categorical attribute is proposed based on this measure. Then, a unified distance metric is proposed by combining, with weights, the distance for the numerical part and that for the categorical part. Furthermore, by incorporating the new weights into the method in [1], an improved density peaks clustering algorithm is presented. Finally, experimental results show the efficiency of the proposed approach.
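The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the importance measure or the exact form of the unified metric, so the sketch makes several assumptions: the per-attribute weights are supplied by the caller, the categorical distance is a weighted count of mismatching attributes, the numerical distance is Euclidean, and the two parts are simply added. The density peaks step computes the usual local density and distance to the nearest denser point with a cutoff `dc`; whether this matches the method in [1] exactly is also an assumption. The names `mixed_distance` and `density_peaks` are hypothetical.

```python
import numpy as np


def mixed_distance(num, cat, weights):
    """Pairwise distance for mixed data (illustrative assumption, not the paper's metric).

    num:     (n, p) numerical attributes
    cat:     (n, m) categorical attributes
    weights: (m,) one weight per categorical attribute, supplied by the caller
    """
    num = np.asarray(num, dtype=float)
    cat = np.asarray(cat)
    weights = np.asarray(weights, dtype=float)

    # Numerical part: Euclidean distance between every pair of objects.
    diff = num[:, None, :] - num[None, :, :]
    d_num = np.sqrt((diff ** 2).sum(axis=2))

    # Categorical part: weighted count of attributes whose values differ.
    mismatch = (cat[:, None, :] != cat[None, :, :]).astype(float)
    d_cat = (mismatch * weights).sum(axis=2)

    # Assumed combination rule: plain sum of the two parts.
    return d_num + d_cat


def density_peaks(dist, dc):
    """Local density rho and distance delta to the nearest denser point."""
    n = dist.shape[0]

    # rho_i: number of other objects closer than the cutoff dc.
    rho = (dist < dc).sum(axis=1) - 1

    delta = np.empty(n)
    for i in range(n):
        denser = np.where(rho > rho[i])[0]
        if denser.size:
            # Distance to the closest object with strictly higher density.
            delta[i] = dist[i, denser].min()
        else:
            # No denser object exists: use the largest distance from i.
            delta[i] = dist[i].max()
    return rho, delta


if __name__ == "__main__":
    num = [[0.0], [0.1], [0.2], [5.0], [5.1]]
    cat = [["a"], ["a"], ["a"], ["b"], ["b"]]
    dist = mixed_distance(num, cat, weights=[1.0])
    rho, delta = density_peaks(dist, dc=0.5)
    print(rho, delta)
```

Objects with both a large `rho` and a large `delta` are candidate cluster centers; each remaining object would then typically be assigned to the cluster of its nearest denser neighbor.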