Grouped Text Clustering Using Non-Parametric Gaussian Mixture Experts

Yong Tian, Yu Rong, Yuan Yao, Weidong Liu, Jiaxing Song
DOI: https://doi.org/10.1007/978-3-319-42911-3_42
2016-01-01
Abstract: Text clustering has many applications in various areas. In practice, texts have often already been grouped, or partially grouped, before they are clustered. Texts from the same group are related to each other and concentrate on a few topics, so the group information turns out to be valuable for text clustering. In this paper, we propose a model called Non-parametric Gaussian Mixture Experts that obtains better clustering results by utilizing group information. After texts are converted to vectors by semantic embedding, our model can automatically infer an appropriate number of clusters for every group and for the whole corpus. We develop an online variational inference algorithm that is scalable and can handle incremental datasets. Our algorithm is tested on various text datasets. The results demonstrate that our model achieves significantly better cluster quality than some classical and recent text clustering methods.