Clustering through decision tree construction

Bing Liu, Yiyuan Xia, Philip S Yu
2000-11-06
Abstract:Clustering aims to find the intrinsic structure of data by organizing data objects into similarity groups or clusters. It is often called unsupervised learning as no class labels denoting an a priori partition of the objects are given. This is in contrast with supervised learning (eg, classification) for which the data objects are already labeled with known classes. Past research in clustering has produced many algorithms. However, these algorithms have some major shortcomings. In this paper, we propose a novel clustering technique, which is based on a supervised learning technique called decision tree construction. The new technique is able to overcome many of these shortcomings. The key idea is to use a decision tree to partition the data space into cluster and empty (sparse) regions at different levels of details. The technique is able to find" natural" clusters in large high dimensional spaces efficiently. It is suitable for …
What problem does this paper attempt to address?