An Enhanced Stream Clustering Algorithm Based on Affinity Propagation
Jianlong ZHAO, Hua QU, Jihong ZHAO, Dingchao JIANG
DOI: https://doi.org/10.7652/xjtuxb201703018
2017-01-01
Abstract: Traditional stream clustering algorithms cannot effectively detect and handle outliers, and their efficiency on incremental data streams is low. To address these problems, an enhanced stream clustering algorithm based on affinity propagation and density measurement is proposed. Building on STRAP, the proposed algorithm improves clustering accuracy and efficiency by introducing a mechanism for outlier detection and removal. First, online stream clustering is performed with the affinity propagation algorithm, while data drift, i.e., a change in the distribution of the data stream over time, is monitored. When drift is detected, the algorithm detects and removes outliers in the reservoir using the local outlier factor, and then re-clusters the current clusters together with the cleaned reservoir to rebuild the dynamic stream clustering model. Finally, experiments on the KDD'99 data show that the proposed algorithm not only reduces the number of re-clustering operations and improves clustering efficiency, but also outperforms STRAP on three clustering evaluation criteria: accuracy, purity, and entropy.
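The reservoir-cleaning step mentioned in the abstract can be illustrated with a minimal pure-Python sketch of the local outlier factor (LOF). This is not the authors' implementation: the function names `lof_scores` and `filter_reservoir`, the neighbourhood size `k`, and the cut-off `threshold=1.5` are all assumptions chosen for illustration, and the sketch assumes the reservoir holds more than `k` points with no exact duplicates.

```python
import math

def knn(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return dists[:k]

def lof_scores(points, k=3):
    """Compute the local outlier factor of every point (hypothetical helper)."""
    n = len(points)
    neigh = [knn(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neigh]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]

def filter_reservoir(reservoir, k=3, threshold=1.5):
    """Split the reservoir into kept points and removed outliers.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    scores = lof_scores(reservoir, k)
    kept = [p for p, s in zip(reservoir, scores) if s <= threshold]
    removed = [p for p, s in zip(reservoir, scores) if s > threshold]
    return kept, removed

# Toy reservoir: a tight group of five points plus one distant point.
reservoir = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
kept, removed = filter_reservoir(reservoir)
print("kept:", kept)
print("removed:", removed)
```

In the pipeline the abstract describes, the `kept` points would then be re-clustered together with the current clusters by affinity propagation; that step is omitted here to keep the sketch short.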