Effective Sample Synthesizing in Kernel Space for Imbalanced Classification

Wenwen Mo, Lianghua He, Yuqin Wang, Jian Lu
DOI: https://doi.org/10.1109/SMC.2018.00083
2018-01-01
Abstract: Imbalanced data are common and pose major challenges when classifying big data. Researchers have found that, for imbalanced classification, samples synthesized in the kernel space are better than those synthesized in the original data space. However, previous kernel-based methods can be defective: they may generate samples far from the classification boundary, and such samples contribute little to revising that boundary. Therefore, this paper proposes a robust synthesis method that generates an effective sample from three real samples under given constraints. Furthermore, we investigate how to determine the stopping criteria. The hyperplane is reconstructed from both the real samples and the synthetic samples, and a series of experiments is designed to test the validity, robustness and effectiveness of the synthetic samples. All the experimental results show that the proposed method is feasible and effective, especially when the dataset is heavily imbalanced.