An approach to class imbalance problem based on stacking and inverse random under sampling methods

Yuwei Zhang, Guanjun Liu, Wenjing Luan, Chungang Yan, Changjun Jiang
DOI: https://doi.org/10.1109/ICNSC.2018.8361344
2018-01-01
Abstract: Class imbalance problems are very common in real-world applications such as fraud detection, medical diagnosis, and anomaly detection. In this paper, we propose an approach to the problem based on stacking and inverse random undersampling (SIRUS). First, inverse random undersampling is used to undersample the majority class in order to generate a large number of different training subsets. Second, a group of different component classifiers learns the decision boundary between the minority and majority classes on each training subset. Finally, a stacking model is applied to separate the minority class from the majority class, where the result produced by each component classifier is taken as a feature to train a meta classifier. Comparison experiments are conducted on 17 datasets from the UCI machine learning repository. Metrics including AUC, F1, and G-mean illustrate the effectiveness of our approach.
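The subset-generation step and two of the reported metrics can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the sampling ratio, so the sketch assumes that "inverse" means each subset keeps fewer majority samples than minority samples, and the names `inverse_random_undersample`, `f1_and_gmean`, and the `ratio` parameter are hypothetical. F1 and G-mean follow their standard definitions.

```python
import math
import random


def inverse_random_undersample(majority, minority, n_subsets, ratio=0.5, seed=0):
    """Return n_subsets pairs (majority_draw, minority_copy).

    Assumption: each draw keeps ratio * len(minority) majority samples,
    so the minority class outnumbers the majority class in every subset.
    """
    rng = random.Random(seed)
    k = max(1, min(len(majority), int(ratio * len(minority))))
    subsets = []
    for _ in range(n_subsets):
        drawn = rng.sample(majority, k)  # sampling without replacement
        subsets.append((drawn, list(minority)))
    return subsets


def f1_and_gmean(y_true, y_pred, positive=1):
    """Compute F1 and G-mean, treating `positive` as the minority label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    gmean = math.sqrt(recall * specificity)  # geometric mean of both recalls
    return f1, gmean
```

Each subset returned by the first function would then be used to train one component classifier, and the classifiers' outputs would form the feature vector for the meta classifier; those stages are omitted here because the abstract does not name the models used.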