Comparison of Ensemble Models as Solutions for Imbalanced Class Classification of Datasets

Yoga Pristyanto, A. F. Nugraha, Ibnu Hadi Purwanto, Mulia Sulistiyono, Rifda Faticha Alfa Aziza, Akhmad Dahlan
DOI: https://doi.org/10.1109/IMCOM56909.2023.10035615
2023-01-03
Abstract: Class imbalance occurs when some classes in a dataset contain far more instances than others. This situation is common in real-world classification problems. Class imbalance keeps a classification model from reaching its best performance and increases the risk of misclassification. One way to address the problem is to use an ensemble model, which has the advantage of leaving the original dataset unchanged. In this work, three ensemble models (XGBoost, Stacking, and Bagging) were examined and compared. All three were tested on five imbalanced multiclass datasets, each with a different imbalance ratio. The results across five evaluation metrics show that the XGBoost model performs much better overall than the Bagging and Stacking models. XGBoost performs exceptionally well on every metric evaluated: Balanced Accuracy, True Positive Rate, True Negative Rate, Geometric Mean, and Multiclass Area Under the Curve. These findings provide further evidence that XGBoost is a viable option for multiclass imbalanced classification problems.
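As an illustration of the evaluation metrics named in the abstract, the sketch below computes per-class True Positive Rate and True Negative Rate, Balanced Accuracy, and Geometric Mean from a multiclass confusion matrix. The abstract does not give the paper's exact formulas, so this assumes common one-vs-rest definitions: Balanced Accuracy as the mean of per-class recall, and Geometric Mean as the geometric mean of per-class recall. The confusion matrix and the function names are hypothetical and are not taken from the paper.

```python
from math import prod


def per_class_rates(cm):
    """Return (tpr, tnr) lists for a square confusion matrix cm[true][pred]."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    tpr, tnr = [], []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # class c predicted as another class
        fp = sum(cm[r][c] for r in range(k)) - tp  # other classes predicted as c
        tn = total - tp - fn - fp
        tpr.append(tp / (tp + fn) if tp + fn else 0.0)
        tnr.append(tn / (tn + fp) if tn + fp else 0.0)
    return tpr, tnr


def summarize(cm):
    """Macro-averaged metrics under the assumed one-vs-rest definitions."""
    tpr, tnr = per_class_rates(cm)
    k = len(cm)
    macro_tpr = sum(tpr) / k
    return {
        "macro_tpr": macro_tpr,
        "macro_tnr": sum(tnr) / k,
        "balanced_accuracy": macro_tpr,   # assumed: mean of per-class recall
        "g_mean": prod(tpr) ** (1 / k),   # assumed: geometric mean of per-class recall
    }


# Hypothetical imbalanced 3-class result: 100, 20, and 10 true instances.
cm = [
    [90, 5, 5],
    [4, 16, 0],
    [2, 0, 8],
]
tpr, tnr = per_class_rates(cm)
result = summarize(cm)
print(result)
```

Because every class contributes equally to these averages, weak recall on a small minority class lowers the score as much as weak recall on the majority class, which is why such metrics are preferred over plain accuracy on imbalanced data.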
Computer Science