Abstract: ML-enabled systems that are deployed in a production environment typically suffer from decaying model prediction quality through concept drift, i.e., a gradual change in the statistical characteristics of a certain real-world domain. To combat this, a simple solution is to periodically retrain ML models, which unfortunately can consume a lot of energy. One recommended tactic to improve energy efficiency is therefore to systematically monitor the level of concept drift and only retrain when it becomes unavoidable. Different methods are available to do this, but we know very little about their concrete impact on the tradeoff between accuracy and energy efficiency, as these methods also consume energy themselves.

What problem does this paper attempt to address?

The core problem this paper addresses is how to balance accuracy and energy efficiency in concept drift detection for machine learning (ML) systems. Specifically, the paper examines how different concept drift detection methods affect accuracy and energy consumption, aiming to provide a basis for the sustainable monitoring of ML systems.

### Problem Background

When ML models are deployed in production environments, their prediction quality usually declines gradually due to concept drift. Concept drift refers to the gradual change in the statistical characteristics of real-world data. A simple solution is to retrain ML models regularly, but this consumes a large amount of energy. The recommended approach is therefore to systematically monitor the degree of concept drift and retrain the model only when necessary. However, little is currently known about the specific impact of these monitoring methods on accuracy and energy efficiency, since the monitoring methods consume energy themselves.

### Research Objectives

The paper studies the trade-off between accuracy and energy efficiency of seven commonly used concept drift detection methods through controlled experiments. The experiments used five synthetic datasets, each with two concept drift types (abrupt and gradual), and trained six different ML models as base classifiers. Based on a full-factorial design, the experiment tested 420 combinations (7 drift detectors × 5 datasets × 2 drift types × 6 base classifiers) and compared energy consumption and drift detection accuracy.

### Main Findings

1. **Three types of detectors**:
   - **Sacrifice energy efficiency for high accuracy**: such as KSWIN.
   - **Balanced detectors**: consume low-to-medium energy and have good accuracy, such as HDDM_W and ADWIN.
   - **Low energy consumption but unusable**: cannot be applied in practice due to extremely poor accuracy, such as HDDM_A, PageHinkley, DDM, and EDDM.
2. **Impact of drift type**: The type of drift (abrupt or gradual) affects the performance of the detectors.
3. **Impact of base classifiers**: The type of base classifier also affects the results, especially the energy consumed when retraining the model.

### Conclusions

The evidence provided supports ML practitioners in making informed decisions when choosing a suitable concept drift detection method, thereby optimizing the effectiveness and environmental sustainability of ML systems. In particular, these findings help improve the energy efficiency of ML systems in dynamic scenarios where the data evolves.

### Formula Representation

The analysis in the paper relies on the following formulas:

- Percentage difference in energy consumption between two detectors:

\[ \text{Difference (\%)} = \left( \frac{\text{Detector 1 Mean} - \text{Detector 2 Mean}}{\text{Detector 2 Mean}} \right) \times 100 \]

- Cohen's d effect size:

\[ \text{d} = \frac{\text{Mean Difference}}{\text{Pooled Standard Deviation}} \]

These formulas help readers interpret the experimental results and the data analysis.
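As a rough illustration of the full-factorial design and the two formulas above, the following Python sketch enumerates the 420 combinations and computes the percentage difference and Cohen's d for two samples. Several things here are assumptions rather than details from the paper: the dataset and classifier names are hypothetical placeholders (the summary does not list them), the energy values are invented for illustration, the function names are made up for this sketch, and the pooled standard deviation uses the common pooled-variance form, which the summary does not spell out.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev

# Detector names are taken from the summary (written here with underscores).
DETECTORS = ["KSWIN", "HDDM_W", "ADWIN", "HDDM_A", "PageHinkley", "DDM", "EDDM"]
# Hypothetical placeholders: the summary only gives the counts (5 and 6).
DATASETS = [f"dataset_{i}" for i in range(1, 6)]
DRIFT_TYPES = ["abrupt", "gradual"]
CLASSIFIERS = [f"classifier_{i}" for i in range(1, 7)]


def factorial_design():
    """Return every (detector, dataset, drift type, classifier) combination."""
    return list(product(DETECTORS, DATASETS, DRIFT_TYPES, CLASSIFIERS))


def percent_difference(detector_1, detector_2):
    """Percentage difference of the mean of detector_1 relative to detector_2."""
    mean_1, mean_2 = mean(detector_1), mean(detector_2)
    return (mean_1 - mean_2) / mean_2 * 100


def cohens_d(detector_1, detector_2):
    """Cohen's d, assuming the usual pooled standard deviation of two samples."""
    n1, n2 = len(detector_1), len(detector_2)
    var_1, var_2 = stdev(detector_1) ** 2, stdev(detector_2) ** 2
    pooled_sd = sqrt(((n1 - 1) * var_1 + (n2 - 1) * var_2) / (n1 + n2 - 2))
    return (mean(detector_1) - mean(detector_2)) / pooled_sd


if __name__ == "__main__":
    print(len(factorial_design()), "combinations")
    # Invented energy measurements (arbitrary units) for two detectors.
    energy_a = [12.0, 14.0, 16.0]
    energy_b = [8.0, 10.0, 12.0]
    print("difference (%):", percent_difference(energy_a, energy_b))
    print("Cohen's d:", cohens_d(energy_a, energy_b))
```

The percentage difference is relative to the second detector, so swapping the arguments changes both the sign and the magnitude, whereas Cohen's d only changes sign.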

How to Sustainably Monitor ML-Enabled Systems? Accuracy and Energy Efficiency Tradeoffs in Concept Drift Detection
