Clustering Interval and Triangular Granular Data: Modeling, Execution, and Assessment
Yiming Tang, Wenbin Wu, Witold Pedrycz, Jianwei Gao, Xianghui Hu, Zhaohong Deng, Rui Chen
DOI: https://doi.org/10.1109/tnnls.2024.3499996
IF: 14.255
2024-01-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: In current granular clustering algorithms, numeric representatives are either selected by users or obtained by an overly simple strategy, and the weight settings for granular data do not adequately express their structural characteristics. To address these problems, this study puts forward a new scheme called the granular weighted kernel fuzzy clustering (GWKFC) algorithm. We propose the representative selection and granularity generation (RSGG) algorithm, inspired by the density peak clustering (DPC) algorithm. Using the numeric representatives obtained by RSGG, we build interval and triangular granular data under the principle of justifiable granularity (PJG), for which we establish several combinations of functions and boundary constraints and prove their properties. Furthermore, we present a novel kernel-based distance formula for granular data and design new weights to affect the coverage and specificity of granular data. Building on these components, we develop the GWKFC algorithm for granular clustering and assess its performance at different levels of granularity. Together, these form a macro framework covering granular modeling, granular clustering, and assessment. Finally, the GWKFC algorithm is compared experimentally with ten other granular clustering algorithms on artificial and UCI datasets, as well as on large and high-dimensional datasets. The results show that the GWKFC algorithm provides better granular clustering results than the other algorithms. The originality of this work is threefold. First, we improve the previous density radius and present the RSGG algorithm to acquire numeric representatives. Second, we propose a new strategy to determine the boundaries of granular data and derive novel weights inspired by the idea of volume. Third, we employ the kernel function to calculate the distance between granular data, which has a stronger spatial division ability than the previously used Euclidean distance.
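As an illustration of the principle of justifiable granularity mentioned in the abstract, the following is a minimal sketch of one common textbook formulation: each bound of an interval around a numeric representative is chosen to maximize the product of coverage and specificity. It is not the RSGG or GWKFC procedure; the function name `pjg_interval`, the parameter `alpha`, and the particular coverage and specificity functions are assumptions made for this example, and the abstract does not state the paper's own function combinations or boundary constraints.

```python
import numpy as np

def pjg_interval(data, rep, alpha=1.0):
    """Hypothetical helper: build an interval [a, b] around `rep`.

    Each bound is picked from the data points on its side of `rep`
    so that coverage * specificity is maximal (an assumed, generic
    PJG formulation, not the paper's specific one).
    """
    data = np.asarray(data, dtype=float)
    span = data.max() - data.min()
    if span == 0.0:
        # All points coincide, so the interval degenerates to a point.
        return float(rep), float(rep)

    def best_bound(candidates):
        best, best_score = float(rep), -1.0
        for c in candidates:
            lo, hi = min(rep, c), max(rep, c)
            # Coverage: fraction of data falling between rep and the bound.
            coverage = np.mean((data >= lo) & (data <= hi))
            # Specificity: shrinks linearly as the bound moves away from rep.
            specificity = max(0.0, 1.0 - abs(c - rep) / span) ** alpha
            score = coverage * specificity
            if score > best_score:
                best, best_score = float(c), score
        return best

    a = best_bound(data[data <= rep])  # lower bound
    b = best_bound(data[data >= rep])  # upper bound
    return a, b

# Usage: interval around the median of a small one-dimensional sample.
sample = [1.0, 2.0, 2.5, 3.0, 3.2, 3.5, 4.0, 6.0, 9.0]
print(pjg_interval(sample, rep=float(np.median(sample))))
```

In this sketch, a larger `alpha` penalizes wide intervals more strongly, giving more specific (narrower) granules at the cost of coverage.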