An Efficient Decision Tree Classification Method Based on Extended Hash Table for Data Streams Mining

Zhenzheng Ouyang, Quanyuan Wu, Tao Wang
DOI: https://doi.org/10.1109/FSKD.2008.481
2008-01-01
Abstract: This paper focuses on handling continuous attributes when mining data streams with concept drift. A data stream is an incremental, online, real-time model. Domingos and Hulten presented a one-pass algorithm; their system, VFDT, uses the Hoeffding inequality to achieve a probabilistic bound on the accuracy of the constructed tree. CVFDT, the extended version of VFDT, handles concept drift efficiently. In this paper, we revisit this problem and implement a system, HashCVFDT, on top of CVFDT. It is as fast as a hash table when inserting, searching for, or deleting attribute values, and it can also sort the attribute values.
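For readers unfamiliar with the Hoeffding inequality mentioned in the abstract, the following is a minimal sketch of how a VFDT-style learner might use it to decide when enough examples have been seen to split a node. It is not taken from the paper: the function names `hoeffding_bound` and `should_split` and the parameters `value_range`, `delta`, and `n` are illustrative assumptions, and tie-breaking and other details of VFDT and CVFDT are omitted.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability at least 1 - delta, the
    true mean of a variable with the given range lies within epsilon of
    the mean observed over n independent samples.

    All names here are hypothetical, chosen for illustration only.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split when the observed gap between the two best attributes
    exceeds the Hoeffding bound, so the leader is unlikely to change
    if more examples were seen."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)
```

The bound shrinks as `n` grows, so a node that cannot yet separate its two best attributes simply waits for more examples from the stream before splitting.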