An Unsupervised Learning Short Text Clustering Method

Zuhua Dai, Kelong Li, Hongyi Li, Xiaoting Li
DOI: https://doi.org/10.1088/1742-6596/1650/3/032090
2020-10-01
Journal of Physics: Conference Series
Abstract: With the continuous development of Natural Language Processing (NLP), the task of short text categorization has received increasing attention. In short text clustering, the high-dimensional sparseness of the text representation matrix is a challenging problem. This paper proposes a deep embedded method that uses an autoencoder over distributed sentence embeddings for feature extraction and cluster assignment. The method maps the data space to a low-dimensional feature space and iteratively optimizes the clustering objective. Experimental results on three short Chinese text data sets verify the effectiveness of the method. Moreover, it is superior to the existing correlation clustering methods.
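The abstract does not spell out the clustering objective, so the following is only a minimal sketch of one common way to "iteratively optimize clustering targets" in a low-dimensional feature space, assuming a Deep Embedded Clustering style formulation (Student's t soft assignment plus a sharpened target distribution). The function names, the toy embeddings `z`, and the fixed `centroids` are hypothetical and stand in for the autoencoder output and learned cluster centres; they are not taken from the paper.

```python
import numpy as np

def soft_assign(z, centroids, alpha=1.0):
    """Soft cluster assignment q_ij using a Student's t kernel (assumed, DEC-style)."""
    # Squared Euclidean distance between every embedding and every centroid.
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    # Normalize each row so the assignments of one sample sum to 1.
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened target p_ij: square q and divide by the soft cluster frequency."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def kl_divergence(p, q):
    """KL(P || Q), the quantity a DEC-style method would minimize."""
    return float(np.sum(p * np.log(p / q)))

# Toy stand-in for low-dimensional sentence features (hypothetical data):
# five points near (0, 0) and five points near (3, 3).
rng = np.random.default_rng(0)
z = np.vstack([rng.normal(0.0, 0.1, (5, 2)), rng.normal(3.0, 0.1, (5, 2))])
centroids = np.array([[0.0, 0.0], [3.0, 3.0]])

q = soft_assign(z, centroids)
p = target_distribution(q)
labels = q.argmax(axis=1)
loss = kl_divergence(p, q)
```

In a full training loop, the encoder weights and the centroids would be updated by gradient descent on `loss`, and `p` would be recomputed periodically; the sketch above shows only a single assignment step.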