Categorical Clustering by Converting Associated Information

Dongmin Cai, Stephen S-T Yau
DOI: https://doi.org/10.5281/zenodo.1075769
2006-01-01
Abstract: The lack of an inherent "natural" dissimilarity measure between objects in a categorical dataset presents special difficulties in clustering analysis. However, each categorical attribute of a given dataset provides natural probability and information in the sense of Shannon. In this paper, we propose a novel method that heuristically converts categorical attributes to numerical values by exploiting this associated information. We conduct an experimental study with a real-life categorical dataset. The experiment demonstrates the effectiveness of our approach.
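The abstract does not spell out the conversion itself, so the following is only a minimal sketch of one way a categorical attribute could be mapped to numbers using Shannon information. It assumes each value is replaced by its self-information, -log2(p), with p estimated as the value's relative frequency in the column. The function name `self_information` and the toy `colors` column are hypothetical, and this should not be read as the authors' actual heuristic.

```python
import math
from collections import Counter


def self_information(values):
    """Replace each categorical value with -log2(p), where p is the
    value's relative frequency in the column (an assumed encoding)."""
    counts = Counter(values)
    total = len(values)
    return [-math.log2(counts[v] / total) for v in values]


# Toy column: "red" appears in half the rows, the others in a quarter each.
colors = ["red", "red", "blue", "green"]
encoded = self_information(colors)
print(encoded)
```

Under this assumed encoding, rare values receive larger numbers than common ones, and the resulting numeric column could be passed to a standard numerical clustering routine. Note that values with equal frequency (here "blue" and "green") become indistinguishable, which is one reason a real method would likely need more than this single step.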