Abstract: Imbalanced data classification poses a major challenge to the data mining community. Although the standard support vector machine (SVM) is generally fairly robust on imbalanced datasets, it is an overall-accuracy-oriented algorithm, so its final decision boundary is biased toward the majority class. Ensemble methods have emerged as meta-techniques for improving the generalization performance of existing learning algorithms. In this paper, we propose a novel cost-sensitive SVM ensemble with self-adaptive cost weights for imbalanced data classification. To keep the optimization objectives of the weak learners and the boosting scheme consistent, we not only use cost-sensitive SVMs as the base weak learners but also modify the standard boosting scheme into a cost-sensitive one. To ensure that successive classifiers are trained on more minority instances, especially borderline minority instances, we also present a self-adaptive, sequential method for determining misclassification cost weights. At each boosting iteration, the method uses the previously obtained classifier to account for the different contributions of minority instances to the formation of the SVM classifier, which allows it to produce diverse classifiers and thus improve generalization performance. In the experiments, we analyze the effect of different parameters on performance and provide some suggestions for setting them. Extensive experimental results on different imbalanced datasets demonstrate that the proposed approach achieves better generalization performance in terms of G-Mean and F-Measure than other existing imbalanced dataset classification techniques.
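
The abstract reports results in terms of G-Mean and F-Measure without defining them. As a minimal sketch, assuming the usual definitions (G-Mean as the geometric mean of sensitivity and specificity, F-Measure as the harmonic mean of precision and recall on the minority class), the two metrics can be computed as follows. The function names and the toy labels are illustrative assumptions, not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall on the minority class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 2 minority (1) and 8 majority (0) instances.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_majority = [0] * 10                       # 80% accurate, finds no minority
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]           # 1 TP, 1 FN, 1 FP, 7 TN

print(g_mean(y_true, always_majority), f_measure(y_true, always_majority))
print(g_mean(y_true, mixed), f_measure(y_true, mixed))
```

In this toy case a classifier that always predicts the majority class is 80% accurate yet scores zero on both metrics, which illustrates why an accuracy-oriented objective can be misleading on imbalanced data.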

Self-adaptive cost weights-based support vector machine cost-sensitive ensemble for imbalanced data classification