Large Margin Clustering on Uncertain Data by Considering Probability Distribution Similarity

Lei Xu, Qinghua Hu, Edward Hung, Baowen Chen, Xu Tan, Changrui Liao
DOI: https://doi.org/10.1016/j.neucom.2015.02.002
IF: 6
2015-01-01
Neurocomputing
Abstract: In this paper, the problem of clustering uncertain objects whose locations are uncertain and described by probability density functions (pdfs) is studied. Though some existing methods (e.g. K-means, DBSCAN) have been extended to handle uncertain object clustering, some limitations remain. K-means assumes that the objects are described by reasonably well-separated spherical balls. Thus, UK-means, which is based on K-means, is limited in handling objects with non-spherical shapes. On the other hand, the probability density function is an important characteristic of uncertain data, but few existing clustering methods measure the difference between objects in terms of their probability density functions. Therefore, in this article, a clustering algorithm based on probability distribution similarity is proposed. Our method aims at finding the largest margin between clusters to overcome the limitation of UK-means. Extensive experimental results on both synthetic and real data sets verify the effectiveness, efficiency and scalability of our method.
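
As an illustration of why distribution similarity can matter, the following minimal sketch compares two uncertain objects that share the same expected location but have different pdfs. It is not the method of the paper: the discrete three-point pdfs, the helper names, and the use of the Jensen-Shannon divergence as the similarity measure are all assumptions made for this example.

```python
import math

def expected_location(points, probs):
    """Mean position of an uncertain object with a discrete pdf."""
    return sum(x * p for x, p in zip(points, probs))

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q); terms with p_i = 0 contribute 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log), an assumed similarity measure."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical 1-D uncertain objects defined over the same sample points.
points = [-1.0, 0.0, 1.0]
obj_a = [0.0, 1.0, 0.0]  # all probability mass at 0
obj_b = [0.5, 0.0, 0.5]  # mass split between -1 and 1

# A distance based only on expected locations cannot tell the objects apart.
center_gap = abs(expected_location(points, obj_a) - expected_location(points, obj_b))

# A distribution-based measure separates them clearly.
pdf_gap = js_divergence(obj_a, obj_b)

print(center_gap, pdf_gap)
```

Here the gap between expected locations is zero, while the divergence between the two pdfs reaches its maximum of ln 2, since the two distributions have disjoint supports.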