Abstract: Imbalanced data classification poses a major challenge in the data mining community. Although the standard support vector machine (SVM) is generally relatively robust on imbalanced datasets, it is an overall-accuracy-oriented algorithm, so its final decision boundary is biased toward the majority class. Ensemble methods have emerged as meta-techniques for improving the generalization performance of existing learning algorithms. In this paper, we propose a novel cost-sensitive SVM ensemble with self-adaptive cost weights for imbalanced data classification. To guarantee that the weak learners and the boosting scheme share consistent optimization objectives, we not only use cost-sensitive SVMs as the base weak learners but also modify the standard boosting scheme into a cost-sensitive one. To ensure that successive classifiers train on more minority instances, especially borderline minority instances, we also present a self-adaptive method for sequentially determining misclassification cost weights. At each boosting iteration, the method uses the previously obtained classifier to account for the different contributions that minority instances make to the formation of the SVM classifier, which allows it to produce diverse classifiers and thus improve generalization performance. In the experiments, we analyze and discuss the effect of different parameters on performance and provide some suggestions for setting them. Extensive experimental results on a range of imbalanced datasets demonstrate that the proposed approach achieves better generalization performance, in terms of G-Mean and F-Measure, than other existing imbalanced data classification techniques.
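The abstract evaluates classifiers by G-Mean and F-Measure rather than overall accuracy. The minimal sketch below illustrates why, assuming the commonly used definitions (G-Mean as the geometric mean of sensitivity and specificity, F-Measure as the weighted harmonic mean of precision and recall); the abstract itself does not spell these out. The toy labels, the two hypothetical prediction vectors, and all function names are illustrative assumptions and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn), treating `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Overall accuracy: fraction of instances classified correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def f_measure(y_true, y_pred, positive=1, beta=1.0):
    """F-beta score on the minority class (assumed definition; beta=1 is F1)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Toy data (illustrative only): 95 majority (0) and 5 minority (1) instances.
y_true = [0] * 95 + [1] * 5

# Hypothetical classifier A: predicts the majority class for everything.
pred_majority_only = [0] * 100

# Hypothetical classifier B: 10 false alarms, but finds 4 of 5 minority cases.
pred_cost_aware = [1] * 10 + [0] * 85 + [1] * 4 + [0] * 1

for name, pred in [("majority-only", pred_majority_only),
                   ("cost-aware", pred_cost_aware)]:
    print(name,
          "accuracy=%.3f" % accuracy(y_true, pred),
          "G-Mean=%.3f" % g_mean(y_true, pred),
          "F-Measure=%.3f" % f_measure(y_true, pred))
```

On this toy data the majority-only predictor has the higher accuracy, yet its G-Mean and F-Measure are both zero because it never detects a minority instance. This is the sense in which an accuracy-oriented decision boundary can look good while failing on the minority class.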

Comparison of Ensemble Models as Solutions for Imbalanced Class Classification of Datasets

Imbalanced Data Sets Classification Method Based on Over-Sampling Technique

A Novel Ensemble Method for Classifying Imbalanced Data

Experimental Study and Comparison of Imbalance Ensemble Classifiers with Dynamic Selection Strategy

Impact of class imbalance ratio on ensemble methods for imbalance problem: A new perspective

Multi-Class Imbalance Problem: A Multi-Objective Solution

Research and application of XGBoost in imbalanced data

Class Imbalance Problem: A Wrapper-Based Approach using Under-Sampling with Ensemble Learning

A weighted hybrid ensemble method for classifying imbalanced data

The Effect of Balancing Methods on Model Behavior in Imbalanced Classification Problems

A Survey of Methods for Managing the Classification and Solution of Data Imbalance Problem

BENN: Balanced Ensemble Neural Network for Handling Class Imbalance in Big Data

A Novel Imbalanced Data Classification Method Based on Weakly Supervised Learning for Fault Diagnosis

Balancing the Scales: A Comprehensive Study on Tackling Class Imbalance in Binary Classification

Comparing Different Oversampling Methods in Predicting Multi-Class Educational Datasets Using Machine Learning Techniques

Adaptive Subspace Optimization Ensemble Method for High-Dimensional Imbalanced Data Classification

Self-paced Ensemble for Highly Imbalanced Massive Data Classification

An Adaptive Multi-Class Imbalanced Classification Framework Based on Ensemble Methods and Deep Network

Self-adaptive cost weights-based support vector machine cost-sensitive ensemble for imbalanced data classification

A review of ensemble learning and data augmentation models for class imbalanced problems: combination, implementation and evaluation

Adaptive ensemble of classifiers with regularization for imbalanced data classification