Using Visualization to Improve Clustering Analysis on Heterogeneous Information Network.
Wenbo Wang, Yuwei Li, Feng Wang, Xiaopei Liu, Youyi Zheng
DOI: https://doi.org/10.1109/iv.2018.00046
2018-01-01
Abstract: The exploration and analysis of data mining methodologies are important tasks for effective knowledge discovery, especially in today's heterogeneous information networks. Previously presented approaches to mining optimization aim primarily at improving time complexity, space complexity, accuracy, and robustness. We extend the state-of-the-art method by concentrating on user-availability and algorithm understandability. Specifically, we use Rankclus, a classic clustering algorithm, as an example. By exposing the otherwise unseen computing steps in visual form, we make the whole clustering process transparent to users, which may help them understand more clearly and quickly how the algorithm is computed and how the objects influence one another. In addition, we use a density approach to simplify the discovery of data patterns intuitively, and through the visualized results, users can adjust algorithm parameters with or without professional training. Finally, we use two further visual techniques to improve the visualization quality: a heatmap matrix designed for checking the similarities of objects in the same cluster, and a DOItree implemented to further analyze the accuracy of the algorithms.
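As a rough illustration of the heatmap-matrix idea mentioned in the abstract, the sketch below computes a pairwise similarity matrix for the objects of one cluster and renders it as a coarse text heatmap. The abstract does not state which similarity measure or object representation the authors use, so the cosine similarity, the feature vectors, and the function names `cosine_similarity`, `cluster_similarity_matrix`, and `render_text_heatmap` are all assumptions made for this example only.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def cluster_similarity_matrix(vectors):
    """Return the n x n matrix of pairwise similarities within one cluster."""
    n = len(vectors)
    return [
        [cosine_similarity(vectors[i], vectors[j]) for j in range(n)]
        for i in range(n)
    ]


def render_text_heatmap(matrix, labels):
    """Map each similarity to a shade character; darker means more similar."""
    shades = " .:*#"
    lines = []
    for label, row in zip(labels, matrix):
        cells = "".join(
            shades[min(int(max(value, 0.0) * len(shades)), len(shades) - 1)]
            for value in row
        )
        lines.append(f"{label:>8} {cells}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical feature vectors for three objects assigned to one cluster.
    labels = ["obj_a", "obj_b", "obj_c"]
    vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(render_text_heatmap(cluster_similarity_matrix(vectors), labels))
```

In such a view, a row that stays light against most of its cluster would hint that the object may be weakly related to the others and worth inspecting.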