A clustering algorithm for parallel coordinates-based measure model and its applications
Hu Jun, Huang Hou-Kuan, Gao Fang
DOI: https://doi.org/10.3321/j.issn:0469-5097.2009.05.011
2009-01-01
Abstract: Applying visualization to data mining, or establishing visual data mining methods, is an interdisciplinary research topic spanning visualization and data mining. Such research must rest on a sound cognitive basis. On the one hand, the theoretical and technical foundations of the method must be analyzed; on the other hand, the visual characteristics of the attributes of the data mining objects, and the observer's perception of those characteristics, must be considered. Two aspects need to be considered when applying visualization to cluster analysis. The first is the separability of the clustering process, that is, whether the process can be split into stages without affecting the clustering result. The second is identifying the key factors and measurement criteria of the clustering algorithm and determining their influence on the result. The K-means algorithm selects the expected number of cluster centers in the dataset and iteratively adjusts those centers to minimize the overall within-cluster variance, thereby identifying the clusters and their centers. K-means has several key parameters that strongly affect the mining result, so the objects to be visualized can be determined from the characteristics of the algorithm and its parameters. To better interpret visualizations of multidimensional data, the processing of the visual objects should correspond to the processing of the data objects, which makes it possible to introduce the necessary measurement indices into the visualization. Introducing appropriate quantitative measurement indices helps to improve the visualization technique, to design applicable measures, and to establish a functional evaluation model. Applying a data mining algorithm on the basis of visual measurement indices yields a visual data mining method. During clustering, visualization helps to reveal the clustering characteristics and to determine the value of K and the seeds. Using the measurement indices as evaluation criteria to refine the clustering parameters and process can improve the clustering result. In this paper, the characteristics and methods of applying visualization techniques are analyzed. A method for determining the data objects to be visualized and for decomposing the data mining algorithm is proposed. The paper proposes a parallel coordinates-based visualization approach and measure model for clustering algorithms. Finally, it gives an application of the approach to the K-means algorithm. The results show that the methods and the parallel coordinates-based visual operations are simple and valid for visualizing data and the K-means clustering algorithm.
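The K-means step described in the abstract, together with the mapping of multidimensional data onto parallel coordinate axes, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmeans` and `to_parallel_axes`, the use of squared Euclidean distance, the min-max scaling of each axis, and the use of the within-cluster sum of squared errors as the measurement index are all assumptions made for illustration. The value of K is given implicitly by the number of seeds supplied.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to the given point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))


def kmeans(points, seeds, max_iter=100):
    """Plain K-means with caller-chosen seeds (K = len(seeds)).

    Returns (centers, labels, sse), where sse is the within-cluster sum of
    squared errors, used here as an assumed measurement index.
    """
    centers = [tuple(s) for s in seeds]
    for _ in range(max_iter):
        # Assignment step: group each point with its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Update step: move each center to the mean of its members;
        # an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # centers stopped moving, so the assignment is stable
        centers = new_centers
    labels = [nearest(p, centers) for p in points]
    sse = sum(sq_dist(p, centers[k]) for p, k in zip(points, labels))
    return centers, labels, sse


def to_parallel_axes(points):
    """Min-max scale every dimension to [0, 1].

    Each scaled point is the list of heights at which its polyline crosses
    the parallel axes, one axis per dimension.
    """
    lows = [min(dim) for dim in zip(*points)]
    highs = [max(dim) for dim in zip(*points)]
    spans = [(hi - lo) or 1.0 for lo, hi in zip(lows, highs)]
    return [
        [(v - lo) / span for v, lo, span in zip(p, lows, spans)]
        for p in points
    ]


if __name__ == "__main__":
    data = [(1.0, 1.0, 1.0), (1.2, 0.8, 1.1), (5.0, 5.0, 5.0), (5.2, 4.8, 5.1)]
    centers, labels, sse = kmeans(data, seeds=[data[0], data[2]])
    print(labels, sse)
    for polyline, label in zip(to_parallel_axes(data), labels):
        print(label, polyline)
```

Running `kmeans` with different seed choices or a different number of seeds and comparing the resulting `sse` values is one simple way to see how K and the seeds influence the result; drawing the scaled polylines colored by label gives the corresponding parallel coordinates view.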