DSP-TMM: A Robust Cluster Analysis Method Based on Diversity Self-Paced T-Mixture Model

Limin Pan, Xiaonan Qin, Senlin Luo
DOI: https://doi.org/10.15918/j.jbit1004-0579.20070
2020-01-01
Journal of Beijing Institute of Technology
Abstract: Outliers in data seriously disturb the estimation of probability density parameters and thereby reduce clustering accuracy. To achieve robust cluster analysis in the presence of such outliers, a method based on the diversity self-paced t-mixture model (DSP-TMM) is proposed. The model first adopts the t-distribution, whose tail is easily controllable, as the sub-model. On this basis, it uses the entropy penalty expectation conditional maximization (ECM) algorithm as a pre-clustering step to estimate the initial parameters. The model then introduces the l2,1-norm as a self-paced regularization term and develops a new ECM optimization algorithm in order to select high-confidence samples from each component during training. Finally, experimental results on several real-world datasets in different noise environments show that the diversity self-paced t-mixture model outperforms state-of-the-art clustering methods. It provides significant guidance for the construction of robust mixture distribution models.
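The abstract does not give the update equations, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. The first function computes the standard E-step weight of a multivariate t component, u = (nu + d) / (nu + delta), where delta is the squared Mahalanobis distance; this is the textbook mechanism by which a t-mixture down-weights outliers. The second function applies the closed-form selection rule commonly used for self-paced learning with an l2,1 diversity term, treating each mixture component as a group. The function names, the parameters `lam` and `gamma`, and the choice of this particular selection rule are assumptions for illustration and may differ from the paper's algorithm.

```python
import numpy as np


def t_weights(X, mu, sigma, nu):
    """E-step weights of one multivariate t component (assumed standard form).

    u_j = (nu + d) / (nu + delta_j), where delta_j is the squared
    Mahalanobis distance of sample j from the component mean.
    """
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    diff = X - np.asarray(mu, dtype=float)
    delta = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(sigma), diff)
    return (nu + d) / (nu + delta)


def select_with_diversity(losses, groups, lam, gamma):
    """Self-paced selection with an l2,1 diversity term (hypothetical helper).

    Within each group, samples are ranked by ascending loss; the sample at
    rank i is kept if loss < lam + gamma / (sqrt(i) + sqrt(i - 1)).
    The threshold shrinks with rank, which spreads the selection over groups.
    """
    losses = np.asarray(losses, dtype=float)
    groups = np.asarray(groups)
    selected = np.zeros(losses.shape[0], dtype=bool)
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        order = idx[np.argsort(losses[idx])]  # easiest samples first
        for rank, j in enumerate(order, start=1):
            threshold = lam + gamma / (np.sqrt(rank) + np.sqrt(rank - 1))
            selected[j] = losses[j] < threshold
    return selected


if __name__ == "__main__":
    X = [[0.0, 0.0], [0.5, -0.5], [8.0, 8.0]]  # last point is an outlier
    print(t_weights(X, mu=np.zeros(2), sigma=np.eye(2), nu=3.0))

    losses = [0.1, 0.2, 0.9, 0.3, 0.35]  # e.g. negative log-likelihoods
    groups = [0, 0, 0, 1, 1]             # component assignment per sample
    print(select_with_diversity(losses, groups, lam=0.2, gamma=0.2))
```

In this sketch the outlier receives a weight close to zero, and setting `gamma` to zero reduces the rule to a plain loss threshold, which can leave an entire component without selected samples.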