ON IMPROVEMENT AND ANALYSIS OF HIERARCHICAL CLUSTERING ALGORITHM

Guo Xiaojuan, Liu Xiaoxia, Li Xiaoling
DOI: https://doi.org/10.3969/j.issn.1000-386X.2008.06.098
2008-01-01
Abstract: A prominent and useful class of algorithms is hierarchical agglomerative clustering (HAC), which iteratively agglomerates the closest pair of clusters until all data points belong to one cluster. However, HAC methods have several drawbacks, such as high time and memory complexity during clustering and insufficient, inaccurate cluster validation. An empirical study shows that most HAC algorithms follow a trend in which, except for a number of top levels of the dendrogram, all lower-level agglomerated clusters are very small in size and close in proximity to other clusters. Methods are proposed to reduce the time and memory complexity significantly and to make validation efficient and accurate. Both analysis and experiments demonstrate the effectiveness of the proposed methods.
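For orientation, the baseline HAC loop described in the abstract (repeatedly merge the closest pair until one cluster remains) can be sketched as follows. This is a minimal illustration of generic HAC, not the improved method the paper proposes; the abstract does not say which linkage criterion is used, so single linkage with Euclidean distance is an assumption here, and the function name `naive_hac` is hypothetical.

```python
from itertools import combinations
from math import dist


def naive_hac(points):
    """Naive agglomerative clustering (assumed: single linkage, Euclidean).

    Returns the merge history as tuples
    (cluster_id_a, cluster_id_b, distance, size_of_merged_cluster).
    """
    # Every point starts as its own cluster; values hold point indices.
    clusters = {i: [i] for i in range(len(points))}
    merges = []
    next_id = len(points)

    while len(clusters) > 1:
        best = None
        # Scan all cluster pairs for the smallest single-linkage distance.
        for a, b in combinations(clusters, 2):
            d = min(
                dist(points[i], points[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            if best is None or d < best[0]:
                best = (d, a, b)

        d, a, b = best
        # Merge the closest pair into a new cluster with a fresh id.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d, len(clusters[next_id])))
        next_id += 1

    return merges


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
    for a, b, d, size in naive_hac(sample):
        print(f"merge {a} + {b} at distance {d:.2f} -> size {size}")
```

Each pass rescans every pair of points across clusters, so this naive version costs roughly cubic time in the number of points, which illustrates the time-complexity drawback the abstract mentions. The merge history also makes it easy to inspect how small the clusters formed at the lower levels of the dendrogram are.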