Abstract: This research proposes a novel drift detection methodology for machine learning (ML) models based on the concept of "deformation" in the vector space representation of data. New data can act as forces that stretch, compress, or twist the geometric relationships a model has learned, and we explore several mathematical frameworks to quantify this deformation. We investigate eigenvalue analysis of covariance matrices to capture global shape changes, and local density estimation using kernel density estimation (KDE) combined with Kullback-Leibler divergence to identify subtle shifts in data concentration. Additionally, we draw inspiration from continuum mechanics by proposing a "strain tensor" analogy to capture multi-faceted deformations across different data types. This requires careful estimation of the displacement field, and we discuss strategies ranging from density-based approaches to manifold learning and neural network methods. By continuously monitoring these deformation metrics and correlating them with model performance, we aim to provide a sensitive, interpretable, and adaptable drift detection system that can distinguish benign data evolution from true drift, enabling timely interventions and supporting the reliability of machine learning systems in dynamic environments. To address the computational cost of this methodology, we discuss mitigation strategies such as dimensionality reduction, approximate algorithms, and parallelization for real-time and large-scale applications. The method's effectiveness is demonstrated through experiments on real-world text data, focusing on detecting context shifts in Generative AI. Our results, supported by publicly available code, highlight the benefits of this deformation-based approach in capturing subtle drifts that traditional statistical methods often miss. Furthermore, we present a detailed application example within the healthcare domain, showcasing the methodology's potential in diverse fields. Future work will focus on further improving computational efficiency and exploring additional applications across different ML domains.
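
As a rough illustration of the covariance eigenvalue analysis mentioned in the abstract, the following minimal sketch compares the eigenvalue spectrum of a reference window with that of a new window. It is not the authors' implementation: the function names `covariance_spectrum` and `spectrum_shift`, the log-ratio score, and the synthetic Gaussian data are all assumptions made here for illustration.

```python
import numpy as np


def covariance_spectrum(X):
    """Eigenvalues of the sample covariance of X (rows = samples), largest first."""
    cov = np.cov(X, rowvar=False)
    # eigvalsh is meant for symmetric matrices and returns ascending eigenvalues.
    return np.linalg.eigvalsh(cov)[::-1]


def spectrum_shift(reference, current, eps=1e-12):
    """Hypothetical deformation score: norm of the log-ratios of matched eigenvalues.

    A score near 0 means the two windows have a similar global shape;
    stretching or compressing along a principal axis increases the score.
    """
    ref = covariance_spectrum(reference)
    cur = covariance_spectrum(current)
    log_ratio = np.log((cur + eps) / (ref + eps))
    return float(np.linalg.norm(log_ratio))


# Synthetic example (assumed data, not from the paper).
rng = np.random.default_rng(0)
reference = rng.normal(size=(2000, 3))
same_shape = rng.normal(size=(2000, 3))
stretched = rng.normal(size=(2000, 3)) * np.array([3.0, 1.0, 1.0])

shift_same = spectrum_shift(reference, same_shape)
shift_stretched = spectrum_shift(reference, stretched)
print("no deformation:", shift_same)
print("stretched axis:", shift_stretched)
```

The score compares only the eigenvalues, so it reacts to stretching and compression but not to a pure rotation of the data; detecting the "twisting" described above would also require comparing eigenvectors.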

Towards AutoML in the Presence of Drift: First Results

Efficiently Mitigating the Impact of Data Drift on Machine Learning Pipelines

Autoregressive based Drift Detection Method

Unsupervised Concept Drift Detection from Deep Learning Representations in Real-time

Dealing with Drift of Adaptation Spaces in Learning-based Self-Adaptive Systems using Lifelong Self-Adaptation

A Scalable Approach to Covariate and Concept Drift Management via Adaptive Data Segmentation

Automating concept-drift detection by self-evaluating predictive model degradation

AutoML @ NeurIPS 2018 challenge: Design and Results

Online AutoML: an adaptive AutoML framework for online learning

A novel framework for concept drift detection using autoencoders for classification problems in data streams

Driftage: a multi-agent system framework for concept drift detection

Assessing Machine Learning Approaches to Address IoT Sensor Drift

Concept Drift Adaptation by Exploiting Drift Type

Human-in-the-loop Handling of Knowledge Drift

Learn to Adapt: Robust Drift Detection in Security Domain

Real-time Drift Detection on Time-series Data

An Adaptive Method for Weak Supervision with Drifting Data

Drift to Remember

An Empirical Evaluation of Meta Residual Network for Classifying Sensor Drift Samples

You are out of context!

Automatic Learning to Detect Concept Drift