CSAL: Cost sensitive active learning for multi-source drifting stream

Hang Zhang, Weike Liu, Hao Yang, Yun Zhou, Cheng Zhu, Weiming Zhang
DOI: https://doi.org/10.1016/j.knosys.2023.110771
IF: 8.139
2023-07-07
Knowledge-Based Systems
Abstract: Multi-source stream classification is a prominent real-world problem, made difficult by scarce true labels and a non-stationary environment. Despite growing research in this field, most existing works assume that all true labels of the source or target streams are available for domain adaptation or transfer learning, which incurs high labeling costs. In real-world applications, however, labeled data are usually insufficient in both source and target streams, and the annotation costs of the source and target streams are generally unequal. We therefore propose a Cost-Sensitive Active Learning (CSAL) method for multi-source drifting streams. Specifically, a multi-source ensemble framework with an asymmetric weighting mechanism is presented to ensure beneficial knowledge transfer and avoid negative transfer. A multi-perspective similarity estimation method is then proposed to evaluate the similarity of source and target streams. On this basis, a novel cost-sensitive hybrid labeling strategy is proposed that combines a volatility strategy and an uncertainty strategy with a cost-sensitive budget control mechanism, adaptively selecting representative samples at the appropriate time. Finally, a parallel multiple-hypothesis drift detection method is proposed, which efficiently uses true labels to detect concept drift. Experimental results on real-world and synthetic data streams show that CSAL outperforms state-of-the-art methods while using fewer labels.
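To make the idea of cost-sensitive budget control concrete, here is a minimal Python sketch of an uncertainty-based query rule under a shared labeling budget, where source and target labels have different prices. It is an illustration only, not the CSAL algorithm: the names `margin_uncertainty`, `CostSensitiveBudget`, and `should_query`, the margin-based uncertainty score, the threshold value, and the example costs are all assumptions, and the volatility strategy, similarity estimation, and drift detection described in the abstract are omitted.

```python
def margin_uncertainty(probs):
    """Return 1 minus the gap between the two largest class probabilities."""
    top = sorted(probs, reverse=True)
    return 1.0 - (top[0] - top[1])


class CostSensitiveBudget:
    """Tracks labeling spend across streams with unequal per-label costs.

    Hypothetical helper: `budget` is the maximum average cost allowed per
    observed sample, and `costs` maps a stream name to its per-label cost.
    """

    def __init__(self, budget, costs):
        self.budget = budget
        self.costs = costs
        self.spent = 0.0
        self.seen = 0

    def observe(self):
        # Count every arriving sample, whether or not it gets labeled.
        self.seen += 1

    def can_afford(self, stream):
        # Would buying this label keep the average cost within budget?
        projected = (self.spent + self.costs[stream]) / max(self.seen, 1)
        return projected <= self.budget

    def charge(self, stream):
        self.spent += self.costs[stream]


def should_query(probs, stream, budget, threshold=0.6):
    """Request a label only if the sample is uncertain and affordable."""
    budget.observe()
    if not budget.can_afford(stream):
        return False
    if margin_uncertainty(probs) >= threshold:
        budget.charge(stream)
        return True
    return False
```

With assumed costs such as `{"source": 1.0, "target": 3.0}`, the same budget buys fewer target labels than source labels, which is the kind of trade-off a cost-sensitive strategy has to manage.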
computer science, artificial intelligence
What problem does this paper attempt to address?