Variational learning of hierarchical infinite generalized Dirichlet mixture models and applications

Wentao Fan, Hassen Sallay, Nizar Bouguila, Sami Bourouis
DOI: https://doi.org/10.1007/s00500-014-1557-5
IF: 3.732
2014-12-16
Soft Computing
Abstract: Data clustering is a fundamental unsupervised learning task in several domains such as data mining, computer vision, information retrieval, and pattern recognition. In this paper, we propose and analyze a new clustering approach based on both hierarchical Dirichlet processes and the generalized Dirichlet distribution, which leads to an interesting statistical framework for data analysis and modelling. Our approach can be viewed as a hierarchical extension of the infinite generalized Dirichlet mixture model previously proposed in Bouguila and Ziou (IEEE Trans Neural Netw 21(1):107–122, 2010). The proposed clustering approach tackles the problem of modelling grouped data, where observations are organized into groups that we allow to remain statistically linked by sharing mixture components. The resulting clustering model is learned using a principled variational Bayes inference algorithm that we have developed. Extensive experiments and simulations, based on two challenging applications, namely image categorization and web service intrusion detection, demonstrate the usefulness and merits of our model.
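The idea of groups sharing mixture components can be illustrated with a minimal sketch of a truncated stick-breaking construction for a hierarchical Dirichlet process. This is an illustration only, not the authors' algorithm: the function names, truncation levels, and concentration values are assumptions chosen for the example, and the sketch covers only the mixing weights, not the generalized Dirichlet component densities or the variational inference.

```python
import random


def stick_breaking(concentration, truncation, rng):
    """Return truncated stick-breaking weights for a Dirichlet process."""
    weights, remaining = [], 1.0
    for _ in range(truncation - 1):
        frac = rng.betavariate(1.0, concentration)  # fraction of the remaining stick
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    weights.append(remaining)  # last stick takes the leftover mass
    return weights


def sample_group_weights(global_weights, concentration, truncation, rng):
    """Draw group-level weights whose atoms are picked from the global components."""
    local = stick_breaking(concentration, truncation, rng)
    component_ids = list(range(len(global_weights)))
    group = [0.0] * len(global_weights)
    for w in local:
        # Each group-level stick points at one shared global component.
        k = rng.choices(component_ids, weights=global_weights)[0]
        group[k] += w
    return group


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical settings: 10 global components, 3 groups, 20 group-level sticks.
global_weights = stick_breaking(concentration=2.0, truncation=10, rng=rng)
groups = [sample_group_weights(global_weights, 1.0, 20, rng) for _ in range(3)]
```

Every group ends up with its own weight vector over the same ten global components, which is the sense in which the groups remain statistically linked.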
computer science, artificial intelligence, interdisciplinary applications