Informative Instance Detection for Active Learning on Imbalanced Data

Jian Xu, Xinyue Wang, Zixin Cai, Liu Yang, Liping Jing
DOI: https://doi.org/10.1109/ijcnn.2019.8852205
2019-01-01
Abstract: In imbalanced data classification, it is hard to learn the hidden pattern of the minority class because that class provides too little information. A popular family of sampling methods addresses this problem within the Active Learning framework, but these methods still face two key issues: how to preserve the structure of the original data, and how to avoid imbalance during the learning process. In this paper, we propose a novel Active Learning framework (COAL) that selects and generates informative instances. To preserve the structure and enhance the diversity of the original data, we use Clustering-based uncertainty sampling to find informative instances. Meanwhile, to avoid imbalance during active learning, we use an Oversampling method to balance the number of instances across classes. Extensive experiments have been conducted on real-world datasets covering a wide range of imbalance ratios (from 2.78 to 66.67). The experimental results show that the proposed COAL outperforms state-of-the-art methods on several well-known evaluation metrics.
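The two ingredients named in the abstract, clustering-based uncertainty sampling and oversampling, can be illustrated with a minimal sketch. This is not the authors' COAL implementation: the abstract does not specify the clustering algorithm, the uncertainty measure, or the oversampling method, so the choices here (plain k-means, distance of the predicted probability from 0.5, and SMOTE-style interpolation between same-class points) are assumptions, and all function names are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (an assumed choice); returns a cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def uncertainty(prob_pos):
    """1.0 when the model is maximally unsure (p = 0.5), 0.0 when certain."""
    return 1.0 - 2.0 * abs(prob_pos - 0.5)


def select_queries(pool, predict_proba, k):
    """Pick the most uncertain unlabeled point from each cluster.

    Querying one point per cluster spreads the queries over the data,
    which is one way to keep structure and diversity.
    """
    labels = kmeans(pool, k)
    picks = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if idx:
            picks.append(max(idx, key=lambda i: uncertainty(predict_proba(pool[i]))))
    return picks


def oversample_minority(X, y, seed=0):
    """Balance classes by interpolating between same-class points (SMOTE-style)."""
    rng = random.Random(seed)
    counts = {c: y.count(c) for c in sorted(set(y))}
    majority = max(counts.values())
    X_out, y_out = list(X), list(y)
    for c, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == c]
        for _ in range(majority - n):
            a, b = rng.choice(members), rng.choice(members)
            t = rng.random()
            X_out.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
            y_out.append(c)
    return X_out, y_out
```

In an active learning loop under these assumptions, `select_queries` would be called on the unlabeled pool with the current model's probability function, the returned indices would be sent for labeling, and `oversample_minority` would be applied to the labeled set before retraining so that the classifier is not trained on a skewed sample.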