Abstract: Data quality is central to machine learning, and in dynamic data streams a major threat to predictive quality is concept drift, where the relationship being learned changes over time. We propose an incremental decision tree algorithm for learning regression trees and model trees from time-varying data streams. The algorithm is designed to run at high speed and to handle data arriving at any scale, including potentially unbounded streams. Its contributions include a probabilistically defined sampling strategy that supports the learning process and an automatic method for handling non-stationary data distributions. The primary contribution, however, is the method for detecting concept drift. Rather than reacting only to an increase in overall error, the algorithm monitors the quality of individual subtrees by tracking how their error evolves. This allows changes in the target function to be detected promptly, so that the model structure can be adapted in time. Extensive experiments evaluate the algorithm in terms of prediction accuracy, model size, and change detection capability.
The results indicate that the proposed algorithm is a competitive alternative to existing data stream classifiers in terms of precision, recall, F-measure, and scalability. Because it adapts quickly to changing data patterns while maintaining high accuracy, it can support decision-making across a range of domains. Its incremental learning of decision rules, combined with an adaptive extension for handling concept drift, makes it a candidate for real-world applications that require accurate and timely insights, such as stream data classification and decision support systems.
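The idea of monitoring subtrees through their error evolution can be illustrated with a small sketch. The abstract does not name the change detector, so this example assumes a Page-Hinkley test attached to every node; the class `PageHinkley`, the helpers `update_path` and `simulate`, the node names, and the default values of `delta` and `threshold` are all hypothetical choices made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a value stream."""
    delta: float = 0.005     # tolerated change magnitude (assumed default)
    threshold: float = 5.0   # alarm threshold, often called lambda (assumed default)
    n: int = 0
    mean: float = 0.0
    cumulative: float = 0.0
    minimum: float = 0.0

    def update(self, error: float) -> bool:
        """Feed one absolute error; return True while an increase is signalled."""
        self.n += 1
        self.mean += (error - self.mean) / self.n          # running mean
        self.cumulative += error - self.mean - self.delta  # cumulative deviation
        self.minimum = min(self.minimum, self.cumulative)
        return self.cumulative - self.minimum > self.threshold


def update_path(detectors: Dict[str, PageHinkley],
                path: List[str], error: float) -> List[str]:
    """Feed one example's error to every node on its root-to-leaf path.

    Returns the nodes whose detectors are currently signalling a change.
    """
    alarmed = []
    for node in path:
        detector = detectors.setdefault(node, PageHinkley())
        if detector.update(error):
            alarmed.append(node)
    return alarmed


def simulate(steps: int = 1000, drift_at: int = 400) -> Dict[str, int]:
    """Toy stream in which the error rises only in the 'right' subtree."""
    detectors: Dict[str, PageHinkley] = {}
    first_alarm: Dict[str, int] = {}
    for t in range(steps):
        if t % 2 == 0:
            path, error = ["root", "left"], 0.1
        else:
            path, error = ["root", "right"], (0.1 if t < drift_at else 1.0)
        for node in update_path(detectors, path, error):
            first_alarm.setdefault(node, t)  # remember the first alarm only
    return first_alarm


if __name__ == "__main__":
    print(simulate())
```

In this toy stream the detector on the `left` node never fires, while the detectors on nodes covering the drifting region do, which is what lets a tree learner decide where in the structure an adaptation is needed. What happens after an alarm (for example pruning or regrowing the affected subtree) is left out of the sketch.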
