ISMOTE: A More Accurate Alternative for SMOTE

Jiuxiang Song, Jizhong Liu
DOI: https://doi.org/10.1007/s11063-024-11695-w
IF: 2.565
2024-10-05
Neural Processing Letters
Abstract: Classification models trained on imbalanced datasets tend to be biased towards the majority category, resulting in reduced accuracy for minority categories. A common approach to address this problem is to generate artificial data for underrepresented categories. The Synthetic Minority Over-sampling Technique (SMOTE) algorithm and its variants are widely used for this purpose. In this paper, we propose a modification to the data generation mechanism called Iteration-based SMOTE (ISMOTE). Unlike SMOTE, ISMOTE trains on the data over multiple iterations. In each iteration, the model generates new samples in the vicinity of appropriately misclassified data. These new samples are then fed into the classification model, improving classification accuracy over the course of the iterations. We compare the performance of ISMOTE with SMOTE and other commonly used oversampling algorithms. Our empirical results demonstrate that ISMOTE significantly improves the quality of the generated data compared to other oversampling methods. Additionally, we conduct experiments to verify the effect of parameters on the model and provide suggestions for choosing appropriate values to improve performance.
computer science, artificial intelligence
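
The iterative loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the classifier, the rule for selecting "appropriately misclassified" points, or the stopping criterion. The sketch assumes a nearest-centroid classifier as a stand-in, treats every misclassified minority sample as a seed, and stops once the classes are balanced. All function names and parameters (`ismote_sketch`, `n_iter`, `k`, `per_iter`) are hypothetical.

```python
import numpy as np


def fit_centroids(X, y):
    # Stand-in classifier (assumption): one centroid per class, labels 0 and 1.
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])


def predict(centroids, X):
    # Assign each row of X to the class of its nearest centroid.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def interpolate(seeds, minority, k, n_new, rng):
    # SMOTE-style step: pick a seed, pick one of its k nearest minority
    # neighbours, and sample a point on the segment between them.
    new_points = []
    for _ in range(n_new):
        x = seeds[rng.integers(len(seeds))]
        dists = np.linalg.norm(minority - x, axis=1)
        neighbours = np.argsort(dists)[1:k + 1]  # index 0 is the seed itself
        nb = minority[rng.choice(neighbours)]
        new_points.append(x + rng.random() * (nb - x))
    return np.array(new_points)


def ismote_sketch(X, y, n_iter=5, k=5, per_iter=20, seed=0):
    # Hypothetical iterative oversampler; minority class is labelled 1.
    rng = np.random.default_rng(seed)
    X_aug, y_aug = X.copy(), y.copy()
    for _ in range(n_iter):
        deficit = int((y_aug == 0).sum() - (y_aug == 1).sum())
        minority = X_aug[y_aug == 1]
        if deficit <= 0 or len(minority) < 2:
            break
        model = fit_centroids(X_aug, y_aug)
        # Assumption: every misclassified minority sample is used as a seed.
        wrong = minority[predict(model, minority) != 1]
        if len(wrong) == 0:
            break
        n_new = min(per_iter, deficit)
        new = interpolate(wrong, minority, k, n_new, rng)
        X_aug = np.vstack([X_aug, new])
        y_aug = np.concatenate([y_aug, np.ones(n_new, dtype=y.dtype)])
    return X_aug, y_aug
```

Calling `ismote_sketch(X, y)` on a feature matrix `X` and a 0/1 label vector `y` returns the original rows followed by any synthetic minority rows. Retraining the classifier at the start of each iteration is what distinguishes this loop from a single SMOTE pass, since the set of misclassified seeds changes as synthetic samples are added.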