Multitask Bregman Clustering

Jianwen Zhang, Changshui Zhang
DOI: https://doi.org/10.1016/j.neucom.2011.02.004
2010-01-01
Proceedings of the AAAI Conference on Artificial Intelligence
Abstract: Traditional clustering methods deal with a single clustering task on a single data set. In some newly emerging applications, multiple similar clustering tasks arise simultaneously. In this case, we want not only a partition for each task but also the relationship among the clusters of different tasks. Exploiting the relationship among tasks is also expected to improve the performance of each individual task. In this paper, we propose general approaches for extending a wide family of traditional clustering models and algorithms to multitask settings. We first formulate multitask clustering as minimizing a loss function composed of a within-task loss and a task regularization term. Then, based on general Bregman divergences, we define the within-task loss as the average Bregman divergence from a data sample to its cluster centroid, and we propose two types of task regularization to encourage coherence among the clustering results of the tasks. We further provide a probabilistic interpretation of the proposed formulations from the viewpoint of joint density estimation. Finally, we propose alternating procedures to solve the resulting optimization problems. In these procedures, the clustering models and the relationship among clusters of different tasks are updated alternately, and the two phases reinforce each other. Empirical results on several real data sets validate the effectiveness of the proposed approaches.
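To make the within-task loss concrete, the following is a minimal single-task sketch, not the authors' implementation. It assumes hard cluster assignments and shows two standard Bregman divergences (squared Euclidean distance and generalized KL divergence). All function names are hypothetical. The task regularization terms are omitted because the abstract does not specify their form.

```python
import math

def squared_euclidean(x, c):
    # Bregman divergence generated by phi(v) = ||v||^2.
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

def generalized_kl(x, c):
    # Bregman divergence generated by phi(v) = sum_i v_i log v_i.
    # Assumes strictly positive entries in x and c.
    return sum(xi * math.log(xi / ci) - xi + ci for xi, ci in zip(x, c))

def within_task_loss(points, labels, centroids, divergence):
    # Average divergence from each sample to its assigned centroid.
    total = sum(divergence(x, centroids[k]) for x, k in zip(points, labels))
    return total / len(points)

def assign(points, centroids, divergence):
    # Hard assignment: each sample goes to the centroid with least divergence.
    return [
        min(range(len(centroids)), key=lambda k: divergence(x, centroids[k]))
        for x in points
    ]

def update_centroids(points, labels, centroids):
    # For any Bregman divergence d(x, c), the centroid minimizing the average
    # divergence over a cluster is the arithmetic mean of its members.
    updated = []
    for k, old in enumerate(centroids):
        members = [x for x, label in zip(points, labels) if label == k]
        if not members:
            updated.append(list(old))  # keep an empty cluster's centroid
        else:
            n = len(members)
            updated.append([sum(col) / n for col in zip(*members)])
    return updated
```

Alternating `assign` and `update_centroids` gives a k-means-style procedure for one task. A multitask variant along the lines of the abstract would add a regularization term coupling the centroids of different tasks to this loss.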