Explainable Impact of Partial Supervision in Semi-Supervised Fuzzy Clustering

Kamil Kmita, Katarzyna Kaczmarek-Majer, Olgierd Hryniewicz
DOI: https://doi.org/10.1109/tfuzz.2024.3370768
IF: 12.253
2024-01-01
IEEE Transactions on Fuzzy Systems
Abstract: Controlling the impact of partial supervision on the outcomes of modeling is of utmost importance in semi-supervised fuzzy clustering. Semi-Supervised Fuzzy C-Means (SSFCMeans), the specific model we consider, uses a single hyperparameter called a scaling factor α to weigh the impact of partially labeled data. This concept became widespread and was reused directly in many works building on SSFCMeans, or even applied to other fuzzy clustering algorithms such as Possibilistic C-Means. However, none of these works challenged the original interpretation of α, which suggests that the impact of partial supervision is directly proportional to the scaling factor. We fill this research gap and thoroughly analyze the relationship. We provide novel explanations of the scaling factor α in terms of the key element of fuzzy clustering: the membership values. We prove that the impact of partial supervision is a non-linear function of α. Our approach is rooted in the explainability framework, which distinguishes interpretation from explanation and treats the latter as superior. Explaining the scaling factor leads to an explainable impact of partial supervision and enables greater control of it. Finally, building on the novel explanations, we propose a unified, analytically justified framework for selecting the value of the hyperparameter α that is based on the cross-validation approach. Using a simulation experiment, we illustrate that the proposed framework enables an extensive analysis of the impact of partial supervision in SSFCMeans.
Keywords: computer science, artificial intelligence; engineering, electrical & electronic