Distributed Latent Dirichlet allocation for objects-distributed cluster ensemble

Hongjun Wang, Zhishu Li, Yang Cheng
DOI: https://doi.org/10.1109/NLPKE.2008.4906792
2008-01-01
Abstract: This paper introduces distributed latent Dirichlet allocation (D-LDA), a model for objects-distributed cluster ensembles that addresses privacy preservation, distributed computing, and knowledge reuse. First, the latent variables in D-LDA and the related terminology are defined for the cluster ensemble setting. Second, Markov chain Monte Carlo (MCMC) approximate inference for D-LDA is described in detail. Third, several datasets from UCI are used for experiments. Compared with the cluster-based similarity partitioning algorithm (CSPA), the hyper-graph partitioning algorithm (HGPA), and the meta-clustering algorithm (MCLA), D-LDA performs better in the reported results. Furthermore, as a soft clustering model, D-LDA not only clusters the data points but also reveals their structure.
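For readers unfamiliar with the cluster ensemble setting, the following is a minimal sketch of the co-association (pairwise similarity) matrix that the CSPA baseline is commonly described as building from a set of base clusterings. It is not taken from the paper and does not implement D-LDA. The function name `co_association` and the toy labelings are assumptions made for illustration only, and the graph partitioning step that CSPA applies to this matrix is omitted.

```python
import numpy as np

def co_association(labelings):
    """Return the fraction of base clusterings in which each pair of objects
    shares a cluster. `labelings` is a hypothetical input of shape
    (n_clusterings, n_objects) holding one cluster label per object."""
    labelings = np.asarray(labelings)
    n_clusterings, n_objects = labelings.shape
    sim = np.zeros((n_objects, n_objects))
    for labels in labelings:
        # Pairwise equality: True where two objects carry the same label.
        sim += (labels[:, None] == labels[None, :])
    return sim / n_clusterings

# Toy example (made-up data): three base clusterings of four objects.
# Label values are arbitrary, so clusterings 1 and 2 describe the same partition.
base = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
S = co_association(base)
```

In this toy case, objects 0 and 1 share a cluster in every base clustering, so their similarity is 1, while objects 0 and 3 never do, so their similarity is 0. Only the label vectors are needed to build the matrix, not the original feature values, which is one reason ensemble methods are discussed in connection with knowledge reuse and privacy preservation.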