A Clustering Based On Information Granularity For High Dimensional Sparse Data

Yaqin Zhao, Xianzhong Zhou
DOI: https://doi.org/10.1109/GRC.2005.1547305
2005-01-01
Abstract: This paper presents an information-granularity-based clustering algorithm that proceeds from smaller granules to larger ones. Initial clustering is performed directly and simply by testing whether two equivalence relations are equal, rather than by computing the intersection of equivalence classes as is usual. Secondary clustering is based on fuzzy granularity: the objects of fuzzy clustering are not the original data but larger granules (the initial clusters). High-dimensional sparse data are effectively compressed and expressed as sparse feature vectors whose dimension is far lower than that of the original data. As a result, the approach can handle very high-dimensional sparse data efficiently, and its result is independent of the order in which the objects are presented.
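The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give their formulas, so several pieces are assumptions: the data are taken to be binary, each object is compressed to the set of its nonzero attribute indices, "equal equivalence relations" is approximated by equality of those compressed keys, and the fuzzy-granularity step is replaced by a threshold cut on Jaccard similarity between granules (connected components of the thresholded similarity graph). The names `sparse_key`, `initial_granules`, `jaccard`, `merge_granules`, and the `threshold` parameter are all hypothetical.

```python
from collections import defaultdict


def sparse_key(row):
    """Compress a binary row to the frozenset of its nonzero column indices."""
    return frozenset(i for i, v in enumerate(row) if v)


def initial_granules(rows):
    """Initial clustering (assumed form): group objects with equal sparse keys.

    Only an equality test (hash lookup) is used, no pairwise intersections.
    """
    groups = defaultdict(list)
    for idx, row in enumerate(rows):
        groups[sparse_key(row)].append(idx)
    return dict(groups)


def jaccard(a, b):
    """Jaccard similarity of two index sets (assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def merge_granules(granules, threshold):
    """Secondary clustering (assumed form): merge granules, not raw objects.

    Granules whose similarity reaches `threshold` are linked, and connected
    components of that graph become the final clusters (union-find).
    """
    keys = sorted(granules, key=lambda k: sorted(k))  # deterministic order
    parent = list(range(len(keys)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            if jaccard(keys[i], keys[j]) >= threshold:
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i, key in enumerate(keys):
        clusters[find(i)].extend(granules[key])
    return sorted(sorted(c) for c in clusters.values())


# Toy data: 5 objects, 8 binary attributes, mostly zeros.
rows = [
    [1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 1],
]
granules = initial_granules(rows)          # 3 granules from 5 objects
clusters = merge_granules(granules, 0.6)   # [[0, 2, 3], [1, 4]]
```

In this sketch the pairwise similarity work is done over granules rather than over individual objects, and both grouping by key equality and taking connected components give the same partition regardless of the order of `rows`, which mirrors the order-independence claimed in the abstract.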