DYCUSBoost: Adaboost-Based Imbalanced Learning Using Dynamic Clustering and Undersampling.

Lingchi Chen, Xiaoheng Deng, Hailan Shen, Congxu Zhu, Le Chang
DOI: https://doi.org/10.1109/dasc/picom/datacom/cyberscitec.2018.00045
2018-01-01
Abstract: Ensemble learning is a powerful approach to classifying imbalanced data in machine learning. Adaboost, one such ensemble method, is often modified to deal with the imbalance problem. However, because sample weights vary across the iterations of the Adaboost algorithm, the distribution of the dataset is not the same for each weak classifier. As a result, resampling based on the feature space alone fails to reflect the change in distribution. To address this problem, this paper proposes DYCUSBoost, an Adaboost-based imbalanced learning approach using dynamic clustering and undersampling. In DYCUSBoost, the clustering process is synchronized with the iterations of Adaboost: clusters formed at different stages of Adaboost are adjusted, which lets DYCUSBoost track the change in distribution. The undersampling method assesses the importance of each cluster and draws more samples from the important ones. In experiments, DYCUSBoost demonstrates desirable performance on commonly accepted evaluation metrics such as AUC, G-Mean, and F-Measure. Moreover, the prediction stability of DYCUSBoost exceeds that of most undersampling methods.
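The cluster-weighted undersampling idea can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not say how cluster importance is computed, so the code below assumes, purely for illustration, that a cluster's importance is the sum of the current boosting weights of its members. The names `allocate_budget` and `undersample` are hypothetical, and the dynamic re-clustering step is omitted; in a boosting loop the allocation would simply be recomputed each round as the weights change.

```python
import random
from collections import defaultdict


def allocate_budget(cluster_ids, weights, budget):
    """Split `budget` samples across clusters in proportion to summed weight.

    ASSUMPTION: importance of a cluster = sum of its members' boosting
    weights. The paper's actual importance measure may differ.
    """
    totals = defaultdict(float)
    sizes = defaultdict(int)
    for c, w in zip(cluster_ids, weights):
        totals[c] += w
        sizes[c] += 1
    grand_total = sum(totals.values())
    # Proportional share, rounded, and capped by the cluster's size.
    return {
        c: min(sizes[c], int(round(budget * totals[c] / grand_total)))
        for c in totals
    }


def undersample(cluster_ids, weights, budget, seed=0):
    """Return sorted indices of the samples kept after undersampling."""
    rng = random.Random(seed)
    quota = allocate_budget(cluster_ids, weights, budget)
    members = defaultdict(list)
    for i, c in enumerate(cluster_ids):
        members[c].append(i)
    chosen = []
    for c, idxs in members.items():
        # Draw without replacement inside each cluster.
        chosen.extend(rng.sample(idxs, quota[c]))
    return sorted(chosen)


# Two majority-class clusters; cluster 1 currently carries more weight.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
weights = [0.05] * 4 + [0.2] * 4
quota = allocate_budget(cluster_ids, weights, budget=5)
kept = undersample(cluster_ids, weights, budget=5)
```

With these toy weights the heavier cluster receives most of the budget, which mirrors the abstract's statement that important clusters collect more samples. Because the shares are rounded independently, the quotas need not sum exactly to the budget in general.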