Interval Kernel Fuzzy C-Means Clustering of Incomplete Data.

Tianhao Li, Liyong Zhang, Wei Lu, Hui Hou, Xiaodong Liu, Witold Pedrycz, Chongquan Zhong
DOI: https://doi.org/10.1016/j.neucom.2017.01.017
IF: 6
2017-01-01
Neurocomputing
Abstract: In the clustering of incomplete data, the handling of missing attribute values and the optimization procedure of clustering are the two main concerns. In this paper, a novel clustering method is proposed to cope with incomplete data. Owing to the uncertainty of missing values, we first estimate these values in the form of intervals using the nearest neighbor method, which utilizes information about the distribution of the data and transforms the incomplete data set into an interval-valued one. Then, a kernel method is introduced to increase the separability between data by implicitly mapping them into a higher-dimensional feature space, in which a kernel-induced distance replaces the Euclidean distance so that the data can still be processed in the original data space. We realize the kernel clustering of the incomplete data set by means of a gradient-based alternating optimization of interval data clustering based on the interval kernel distance. Finally, the experimental results demonstrate that the proposed approach is superior in terms of its clustering performance.
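The two building blocks named in the abstract, nearest-neighbor interval estimation and a kernel-induced distance on intervals, can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the function names, the use of the partial distance to rank neighbors, the choice of the minimum and maximum neighbor values as interval bounds, the neighbor count `q`, and the Gaussian kernel with width `sigma` applied to the concatenated lower and upper bounds.

```python
import numpy as np


def partial_distance(a, b):
    """Euclidean distance over attributes observed in both vectors,
    rescaled by (total attributes / shared attributes)."""
    shared = ~np.isnan(a) & ~np.isnan(b)
    if not shared.any():
        return np.inf  # no common attribute, so the pair is not comparable
    diff = a[shared] - b[shared]
    return np.sqrt(a.size / shared.sum() * np.sum(diff ** 2))


def nn_intervals(X, q=3):
    """Turn a data matrix with NaNs into interval data (lower, upper).

    Observed values become degenerate intervals [x, x]. A missing value is
    replaced by [min, max] of that attribute over the q nearest neighbors
    that have the attribute observed (assumed rule, for illustration).
    Assumes every attribute is observed in at least one sample.
    """
    n = X.shape[0]
    lower, upper = X.copy(), X.copy()
    for k in range(n):
        missing = np.flatnonzero(np.isnan(X[k]))
        if missing.size == 0:
            continue
        dists = np.array([
            np.inf if i == k else partial_distance(X[k], X[i])
            for i in range(n)
        ])
        order = np.argsort(dists)
        for j in missing:
            # Closest comparable samples whose attribute j is observed.
            cands = [i for i in order
                     if np.isfinite(dists[i]) and not np.isnan(X[i, j])][:q]
            if not cands:
                # Fallback: use every sample where attribute j is observed.
                cands = list(np.flatnonzero(~np.isnan(X[:, j])))
            vals = X[cands, j]
            lower[k, j], upper[k, j] = vals.min(), vals.max()
    return lower, upper


def interval_kernel_distance(x_lo, x_hi, v_lo, v_hi, sigma=1.0):
    """Squared kernel-induced distance between two interval vectors.

    For a Gaussian kernel K, ||phi(x) - phi(v)||^2 = 2 * (1 - K(x, v)).
    Here K is evaluated on the lower and upper bounds jointly (assumption).
    """
    sq = np.sum((x_lo - v_lo) ** 2) + np.sum((x_hi - v_hi) ** 2)
    return 2.0 * (1.0 - np.exp(-sq / sigma ** 2))
```

For example, calling `nn_intervals(X, q=2)` on a small array that contains `np.nan` entries returns two arrays of the same shape with no missing values, which can then be passed row by row to `interval_kernel_distance` together with interval-valued cluster prototypes.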
What problem does this paper attempt to address?