EM-IFCM: Fuzzy c-means clustering algorithm based on edge modification for imbalanced data

Yue Pu, Wenbin Yao, Xiaoyong Li
DOI: https://doi.org/10.1016/j.ins.2023.120029
IF: 8.1
2023-12-01
Information Sciences
Abstract: The improved fuzzy c-means (IFCM) algorithm is an effective technique for handling the “uniform effect” in imbalanced data clustering; it adjusts the weight of each class based on the fuzzy size between clusters. However, as the imbalance rate increases, the IFCM algorithm produces a “siphon effect”: it misclassifies samples from small classes into large ones. Our analysis shows that this effect occurs because all samples in the same class share the same weight value, so the membership values become polarized and the model fails to converge to the correct interval. Thus, we propose an imbalanced fuzzy c-means clustering algorithm based on edge modification (EM-IFCM) to alleviate the “siphon effect” of the IFCM algorithm. It achieves stronger inter-class separability by dynamically adjusting the sample weights to enhance the influence of edge samples on the model. In addition, we analyze the effectiveness and complexity of the algorithm and prove its convergence. Finally, we conduct extensive experiments on synthetic, machine-learning, and image-segmentation datasets and compare the results with those of six algorithms. The experimental results show that EM-IFCM has higher accuracy and exhibits an imbalance rate that is at least 1.94 times higher than that of the other algorithms.
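For readers unfamiliar with the baseline, the following is a minimal sketch of the standard fuzzy c-means (FCM) updates that IFCM and EM-IFCM build on. It is not the paper's algorithm: the abstract does not give the class-weight or edge-sample weight formulas, so none are modeled here. The function names (`memberships`, `update_centers`, `fcm_1d`), the one-dimensional toy data, and the initial centers are all assumptions chosen for illustration.

```python
def memberships(xs, centers, m=2.0, eps=1e-12):
    """Standard FCM membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))."""
    p = 2.0 / (m - 1.0)
    rows = []
    for x in xs:
        # eps keeps distances strictly positive when a point sits on a center
        d = [abs(x - c) + eps for c in centers]
        rows.append([1.0 / sum((dj / dk) ** p for dk in d) for dj in d])
    return rows


def update_centers(xs, u, m=2.0):
    """Standard FCM center update: weighted mean with weights u_ij^m."""
    centers = []
    for j in range(len(u[0])):
        w = [row[j] ** m for row in u]
        centers.append(sum(wi * x for wi, x in zip(w, xs)) / sum(w))
    return centers


def fcm_1d(xs, init_centers, m=2.0, max_iter=200, tol=1e-8):
    """Alternate the two updates until the centers stop moving."""
    centers = list(init_centers)
    for _ in range(max_iter):
        u = memberships(xs, centers, m)
        new_centers = update_centers(xs, u, m)
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(xs, centers, m)


# Hypothetical imbalanced toy data: 100 points in [0, 1) and 3 points near 5.
xs = [0.01 * i for i in range(100)] + [5.0, 5.1, 5.2]
centers, u = fcm_1d(xs, init_centers=[0.0, 5.0])
print(centers)
```

Every sample contributes to every center through its membership, which is why, in plain FCM, a large class can influence the center assigned to a small one. The weighting schemes described in the abstract modify how much each class (IFCM) or each sample (EM-IFCM) contributes.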
computer science, information systems