Abstract: Concept drift in dynamic data streams undermines the quality and accuracy of predictive models. We propose an incremental decision tree algorithm for learning regression trees and model trees from time-varying data streams. The algorithm operates at high speed and accommodates data arriving at any scale, including potentially unbounded streams. Its contributions include a probabilistically defined sampling strategy that improves the learning process and an automatic method for handling non-stationary data distributions. The primary innovation, however, is the method for detecting concept drift. Rather than reacting only after the overall error has increased, the algorithm proactively monitors the quality of individual subtrees by tracking how their error evolves. This allows changes in the target function to be detected promptly and the model structure to be adapted in time. Extensive experiments evaluate the algorithm in terms of prediction accuracy, model size, and change detection capability. The results indicate that it is a competitive alternative to existing stream classifiers, with superior precision, recall, F-measure, and scalability. Its incremental learning of decision rules, together with an adaptive extension for handling concept drift, makes it promising for real-world applications in which accurate and timely insights are paramount, such as stream data classification and decision support systems.
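To make the idea of tracking a subtree's error evolution concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not name the statistical test used, so the sketch assumes a Page-Hinkley test applied to the stream of absolute errors observed at one node. The class name `PageHinkley`, the helper `first_alarm`, and the parameter values `delta` and `threshold` are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PageHinkley:
    """Page-Hinkley test for an upward shift in a stream of errors.

    Assumption: one such detector would be attached to each monitored
    subtree; the abstract does not state which test is actually used.
    """
    delta: float = 0.005      # tolerated magnitude of change (assumed value)
    threshold: float = 5.0    # alarm threshold, often called lambda (assumed value)
    n: int = 0                # number of errors seen so far
    mean: float = 0.0         # running mean of the errors
    cum: float = 0.0          # cumulative deviation from the running mean
    min_cum: float = 0.0      # smallest cumulative deviation seen so far

    def update(self, error: float) -> bool:
        """Feed one error value; return True if a change is signalled."""
        self.n += 1
        self.mean += (error - self.mean) / self.n
        self.cum += error - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


def first_alarm(errors: Iterable[float]) -> Optional[int]:
    """Return the index of the first alarm, or None if no change is signalled."""
    detector = PageHinkley()
    for i, err in enumerate(errors):
        if detector.update(err):
            return i
    return None


if __name__ == "__main__":
    stable = [0.1] * 200    # errors while the concept is stable
    drifted = [1.0] * 50    # errors after the concept changes
    print("stable only:", first_alarm(stable))
    print("with drift: ", first_alarm(stable + drifted))
```

In this sketch an alarm at a node would be the cue to adapt the corresponding part of the tree, for example by regrowing or replacing that subtree. Which adaptation step the proposed algorithm actually takes is not specified in the abstract.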