A New Representation of Interval Symbolic Data and Its Application in Dynamic Clustering
Wenhua Li,Junpeng Guo,Ying Chen,Minglu Wang
DOI: https://doi.org/10.1007/s00357-016-9193-7
IF: 1.333
2016-02-23
Journal of Classification
Abstract:In this study, we consider the type of interval data summarizing the original samples (individuals) with classical point data. This type of interval data are termed interval symbolic data in a new research domain called, symbolic data analysis. Most of the existing research, such as the (centre, radius) and [lower boundary, upper boundary] representations, represent an interval using only the boundaries of the interval. However, these representations hold true only under the assumption that the individuals contained in the interval follow a uniform distribution. In practice, such representations may result in not only inconsistency with the facts, since the individuals are usually not uniformly distributed in many application aspects, but also information loss for not considering the point data within the intervals during the calculation. In this study, we propose a new representation of the interval symbolic data considering the point data contained in the intervals. Then we apply the city-block distance metric to the new representation and propose a dynamic clustering approach for interval symbolic data. A simulation experiment is conducted to evaluate the performance of our method. The results show that, when the individuals contained in the interval do not follow a uniform distribution, the proposed method significantly outperforms the Hausdorff and city-block distance based on traditional representation in the context of dynamic clustering. Finally, we give an application example on the automobile data set.
mathematics, interdisciplinary applications,psychology, mathematical