Entropy-based concept drift detection in information systems

Yingying Sun, Jusheng Mi, Chenxia Jin
DOI: https://doi.org/10.1016/j.knosys.2024.111596
IF: 8.139
2024-04-01
Knowledge-Based Systems
Abstract: As time passes, the data within information systems may continuously evolve, causing the target concept to drift. To ensure the effectiveness of data-driven decision making, it is crucial to detect drift in a timely manner and gather relevant information. In this paper, we introduce two methods that can directly detect concept drift in the provided information system, by considering a new perspective on uncertainty. First, using entropy under a single attribute constraint, we define the uncertainty of the target concept in an information system. By integrating the uncertainty of each attribute, the overall uncertainty of the target concept in the information system is obtained. Subsequently, two concept drift detection methods are proposed, namely EBTBM (Entropy-Based Threshold-Based Method) and EBSBM (Entropy-Based Sampling-Based Method). These methods utilize the defined uncertainty of the target concept as a statistical measure of the difference between two data blocks. Finally, extensive experiments on artificial and real-world data sets are conducted to validate the effectiveness of the proposed concept drift detection methods.
computer science, artificial intelligence
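The abstract describes the threshold-based idea only in outline, so the following is a minimal sketch of one way it could look, not the paper's EBTBM. It assumes that the "entropy under a single attribute constraint" can be approximated by the conditional entropy of the target given one attribute, that the per-attribute values are integrated by a plain mean, and that drift is flagged when the absolute difference between two data blocks exceeds a fixed threshold. The function names, the mean aggregation, and the default threshold of 0.1 are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(rows, attr, target):
    """Estimate H(target | attr) in bits from a list of dict-like rows."""
    # Count target values inside each group of rows sharing one attribute value.
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[attr]][row[target]] += 1

    n = len(rows)
    h = 0.0
    for counts in groups.values():
        m = sum(counts.values())
        for c in counts.values():
            p = c / m
            # Weight each group's entropy by the group's share of the block.
            h -= (m / n) * p * math.log2(p)
    return h


def block_uncertainty(rows, attrs, target):
    """Mean per-attribute conditional entropy (aggregation rule is an assumption)."""
    return sum(conditional_entropy(rows, a, target) for a in attrs) / len(attrs)


def drift_detected(ref_block, new_block, attrs, target, threshold=0.1):
    """Flag drift when the uncertainty gap between two blocks exceeds `threshold`.

    The threshold value is hypothetical; the paper's own choice is not given
    in the abstract.
    """
    diff = abs(
        block_uncertainty(new_block, attrs, target)
        - block_uncertainty(ref_block, attrs, target)
    )
    return diff > threshold, diff


if __name__ == "__main__":
    # Reference block: attribute "a" fully determines the target "d".
    ref = [{"a": 0, "d": 0}, {"a": 0, "d": 0}, {"a": 1, "d": 1}, {"a": 1, "d": 1}]
    # New block: "a" no longer carries any information about "d".
    new = [{"a": 0, "d": 0}, {"a": 0, "d": 1}, {"a": 1, "d": 0}, {"a": 1, "d": 1}]
    print(drift_detected(ref, new, attrs=["a"], target="d"))
```

In the toy data, the attribute fully determines the target in the reference block and is uninformative in the new block, so the uncertainty gap is large and drift is flagged. A sampling-based variant such as the one the abstract names EBSBM would presumably replace the fixed threshold with a resampling-derived one, but the abstract does not give enough detail to sketch it.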